Scikit-Learn : Linear Regression

In this topic we discuss linear regression in Scikit-Learn. Linear regression is a widely used statistical model that studies the relationship between a dependent variable (Y) and a given set of independent variables (X). The relationship is established by fitting a best-fit line through the data (a hyperplane when there is more than one independent variable).

sklearn.linear_model.LinearRegression is the class used to implement linear regression.
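To make the idea of "fitting a best line" concrete, here is a minimal sketch of an ordinary least squares fit done directly with NumPy. The toy data and variable names are assumptions chosen for illustration. This is not a description of how LinearRegression is implemented internally.

```python
import numpy as np

# Hypothetical toy data that lies exactly on the line y = 2*x + 1
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

# Design matrix: one column for x, one column of ones for the intercept
A = np.column_stack([x, np.ones_like(x)])

# Least squares solution: the w that minimizes ||A @ w - y||^2
w, residuals, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)
slope, intercept = w

print(slope, intercept)
```

Because the data is noise-free, the recovered slope and intercept should match 2 and 1 up to floating-point rounding.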

Parameters

The following table lists the parameters accepted by LinearRegression:

Sr.No. | Parameter & Description
1 | fit_intercept: Boolean, optional, default True. Whether to calculate the intercept for the model. If set to False, no intercept is used in the calculation.
2 | normalize: Boolean, optional, default False. If set to True, the regressors X are normalized before regression by subtracting the mean and dividing by the L2 norm. If fit_intercept = False, this parameter is ignored. Note that normalize was deprecated in scikit-learn 1.0 and removed in 1.2.
3 | copy_X: Boolean, optional, default True. If True, X is copied. If False, X may be overwritten.
4 | n_jobs: int or None, optional, default None. The number of jobs to use for the computation. None means 1 job unless a joblib parallel backend is active, and -1 means using all processors.
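The effect of fit_intercept can be seen in a small sketch. The data below is hypothetical and was chosen so that the true intercept is clearly non-zero.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data on the line y = 2*x + 5
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = 2.0 * X.ravel() + 5.0

with_intercept = LinearRegression(fit_intercept=True).fit(X, y)
no_intercept = LinearRegression(fit_intercept=False).fit(X, y)

# With an intercept, the model recovers slope 2 and intercept 5
print(with_intercept.coef_, with_intercept.intercept_)

# Without an intercept, the line is forced through the origin,
# so intercept_ is 0.0 and the slope has to compensate
print(no_intercept.coef_, no_intercept.intercept_)
```

Setting fit_intercept to False is usually only appropriate when the data is already centered or the relationship is known to pass through the origin.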

Attributes

The following table lists the attributes exposed by LinearRegression after fitting:

Sr.No. | Attribute & Description
1 | coef_: array, shape (n_features,) or (n_targets, n_features). The estimated coefficients for the linear regression problem. It is a 2D array of shape (n_targets, n_features) if multiple targets are passed during fit (y is 2D), and a 1D array of length n_features if only one target is passed.
2 | intercept_: float or array of shape (n_targets,). The independent term in the linear model. It is set to 0.0 if fit_intercept = False.
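The shape rules for coef_ and intercept_ can be checked with a short sketch. The three-target y below is a made-up construction used only to make the (n_targets, n_features) shape visible.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])    # 4 samples, 2 features
y_single = X @ np.array([1, 2]) + 3               # one target, shape (4,)
# Hypothetical multi-target y: three columns derived from y_single
y_multi = np.column_stack([y_single, 2 * y_single, -y_single])  # shape (4, 3)

single = LinearRegression().fit(X, y_single)
multi = LinearRegression().fit(X, y_multi)

print(single.coef_.shape)            # 1D: (n_features,)
print(multi.coef_.shape)             # 2D: (n_targets, n_features)
print(np.shape(single.intercept_))   # scalar intercept
print(multi.intercept_.shape)        # one intercept per target
```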

Implementation Example

First, import the required packages:

import numpy as np
from sklearn.linear_model import LinearRegression

Now, provide the values for the independent variable X:

X = np.array([[1,1],[1,2],[2,2],[2,3]])

Next, the value of the dependent variable y can be calculated as follows:

y = np.dot(X, np.array([1,2])) + 3

Now, create a linear regression object and fit it to the data as follows:

# normalize is omitted here: it was deprecated in scikit-learn 1.0 and removed in 1.2
regr = LinearRegression(
   fit_intercept = True, copy_X = True, n_jobs = 2
).fit(X,y)
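In recent scikit-learn releases the normalize option is no longer available (it was deprecated in 1.0 and removed in 1.2). If feature scaling is wanted, one possible approach is to put a scaler in front of the regressor in a pipeline. The sketch below uses StandardScaler, which divides by the standard deviation rather than the L2 norm, so it is a substitute and not an exact reproduction of the old behavior. For plain linear regression the predictions should be the same either way.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
y = np.dot(X, np.array([1, 2])) + 3

# Scale the features first, then fit ordinary least squares on the scaled data
scaled_regr = make_pipeline(StandardScaler(), LinearRegression()).fit(X, y)

# New samples passed to predict() go through the same scaling
pred = scaled_regr.predict(np.array([[3, 5]]))
print(pred)
```

The coefficients of the inner LinearRegression step are expressed in scaled units, so they will differ from the unscaled fit even though the predictions agree.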

Use the predict() method to make predictions with this linear model as follows:

regr.predict(np.array([[3,5]]))

Output

array([16.])

Example

To get the coefficient of determination (R²) of the prediction, we can use the score() method as follows:

regr.score(X,y)

Output

1.0
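A score of 1.0 means the model explains all of the variance in y, which is expected here because y was generated from X without noise. As a rough illustration of what score() reports, the sketch below computes R² by hand as 1 - SS_res / SS_tot on hypothetical noisy data and compares it with the value returned by score().

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical noisy data, roughly following y = x
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])

model = LinearRegression().fit(X, y)
y_pred = model.predict(X)

ss_res = np.sum((y - y_pred) ** 2)      # residual sum of squares
ss_tot = np.sum((y - y.mean()) ** 2)    # total sum of squares
r2_manual = 1.0 - ss_res / ss_tot

print(r2_manual, model.score(X, y))
```

The two printed values should agree up to floating-point rounding, and both fall below 1.0 because the data no longer lies exactly on a line.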

Example

We can read the estimated coefficients from the attribute named coef_ as follows:

regr.coef_

Output

array([1., 2.])

Example

We can get the intercept, i.e. the expected mean value of Y when all X = 0, from the attribute named intercept_ as follows:

regr.intercept_

Output

3.0000000000000018
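Together, coef_ and intercept_ fully describe the fitted model. As a quick check, the sketch below rebuilds a prediction by hand as X_new · coef_ + intercept_ and compares it with predict(). The name X_new is introduced here only for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
y = np.dot(X, np.array([1, 2])) + 3
regr = LinearRegression().fit(X, y)

X_new = np.array([[3, 5]])

# Linear model by hand: dot product with the coefficients plus the intercept
manual = X_new @ regr.coef_ + regr.intercept_

print(np.allclose(manual, regr.predict(X_new)))
```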

Complete code of implementation example

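import numpy as np
from sklearn.linear_model import LinearRegression

# Independent variables: 4 samples, 2 features
X = np.array([[1,1],[1,2],[2,2],[2,3]])
# Dependent variable: y = 1*x1 + 2*x2 + 3
y = np.dot(X, np.array([1,2])) + 3
# normalize is omitted: it was deprecated in scikit-learn 1.0 and removed in 1.2
regr = LinearRegression(
   fit_intercept = True, copy_X = True, n_jobs = 2
).fit(X,y)
print(regr.predict(np.array([[3,5]])))   # prediction for a new sample
print(regr.score(X,y))                   # coefficient of determination (R^2)
print(regr.coef_)                        # estimated coefficients
print(regr.intercept_)                   # estimated intercept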
