The Normal Equation for Linear Regression in Matrix Form
In this tutorial I will go through a simple example implementing the normal equation for linear regression in matrix form. The IPython notebook I used to generate this post can be found on GitHub.
The primary focus of this post is to illustrate how to implement the normal equation without getting bogged down with a complex data set. To that end I have chosen a simple, albeit contrived, dataset.
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
Setting Up and Prepping our Data
First we will create a pandas DataFrame and populate it with our data. This sample data will have only 3 examples, each consisting of one feature x1 and a corresponding target y; we then insert a bias column x0 of ones. Note: pandas supports IO for a wide collection of file types, so you are free to read any data set you’d like from an external source (a brief sketch follows the code below).
trainingData = pd.DataFrame(data=[[1,1], [2,2], [4,4]], columns=['x1', 'y'])
trainingData.insert(0, 'x0', np.ones(3))
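As noted above, the same DataFrame could instead be loaded from an external file. A minimal sketch, assuming a hypothetical data.csv with columns x1 and y:

trainingData = pd.read_csv('data.csv')  # hypothetical file with columns x1 and y
trainingData.insert(0, 'x0', np.ones(len(trainingData)))  # add the bias column, as above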
Now that we have the data, let's plot it (using seaborn) to get some intuition for what the hypothesis function might be.
with sns.axes_style("darkgrid"):
    g = sns.lmplot(data=trainingData, x='x1', y='y', markers='o', fit_reg=False)
    g.set(xlim=(0, None), ylim=(0, None))
From the above plot it is fairly easy to see that the hypothesis should be linear, and of the form:

$$h_\theta(x) = \theta_0 x_0 + \theta_1 x_1$$

Furthermore, it's easy to see that the hypothesis is actually:

$$h_\theta(x) = x_1$$

And our thetas are:

$$\theta_0 = 0, \qquad \theta_1 = 1$$
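Recall the normal equation, which solves for the parameter vector in closed form:

$$\theta = (X^T X)^{-1} X^T y$$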
Now let's use the normal equation to confirm our belief. To begin, we construct the design matrix X and the target vector y.
X = trainingData[['x0', 'x1']]
y = trainingData[['y']]
Next we transpose X, using the shorthand T for the pandas transpose property. Since we are transposing a 3×2 matrix, we can expect to end up with a 2×3 matrix as a result.
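We can sanity-check the shapes directly (a quick illustrative check, not part of the derivation):

print(X.shape)    # (3, 2): 3 examples, 2 columns (x0 and x1)
print(X.T.shape)  # (2, 3)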
Applying the Normal Equation
Next we calculate X transpose multiplied by X. Since we are doing matrix multiplication, as opposed to elementwise multiplication, we will need to use the pandas DataFrame.dot() function.
xTx = X.T.dot(X)
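For our three training examples this product is easy to verify by hand:

$$X^T X = \begin{pmatrix} 3 & 7 \\ 7 & 21 \end{pmatrix}$$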
We then take the inverse of our product using the numpy inverse function.
xTx_inv = np.linalg.inv(xTx)
array([[ 1.5 , -0.5 ],
[-0.5 , 0.21428571]])
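These values agree with the analytic 2×2 inverse:

$$(X^T X)^{-1} = \frac{1}{3 \cdot 21 - 7 \cdot 7}\begin{pmatrix} 21 & -7 \\ -7 & 3 \end{pmatrix} = \begin{pmatrix} 1.5 & -0.5 \\ -0.5 & 3/14 \end{pmatrix}$$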
We multiply the inverse by the transpose of X, which we previously calculated.
xTx_inv_xT = xTx_inv.dot(X.T)
array([[ 1. , 0.5 , -0.5 ],
[-0.28571429, -0.07142857, 0.35714286]])
Finally we multiply the previous result by our target vector.
theta = xTx_inv_xT.dot(y)
array([[ 0.],
       [ 1.]])
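As a sanity check (an extra step, not part of the original derivation), numpy's least-squares solver should recover the same parameters:

theta_check = np.linalg.lstsq(X.values, y.values, rcond=None)[0]
print(theta_check)  # approximately [[0.], [1.]]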
Final Confirmation Plot
The normal equation has confirmed our initial guess that the hypothesis is $h_\theta(x) = x_1$. Finally, we visualize the hypothesis with a confirmation plot.
# generate the y values for the hypothesis function
xs = list(range(6))
hypothesis = [theta[0, 0] + x * theta[1, 0] for x in xs]

with sns.axes_style("darkgrid"):
    fig, ax = plt.subplots()
    ax.set_title('Linear Regression with the Normal Equation')
    ax.plot(trainingData['x1'], trainingData['y'], 'o', label='Data')
    ax.plot(xs, hypothesis, 'k-', label='Hypothesis')
    ax.legend(loc='best')
    ax.set(xlim=(0, 5), ylim=(0, 5))
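One last check (a small addition for completeness): evaluating the hypothesis at the training inputs reproduces the targets.

predictions = X.values.dot(theta)  # h(x) for each training example
print(predictions.ravel())         # approximately [1. 2. 4.], matching y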