Problem: Given a collection of data points $(x_1, y_1), \dots, (x_n, y_n)$, find a function $f$ which best fits these data.
When performing linear regression one assumes that $f$ is a linear function. This assumption satisfies the first two desiderata, but seldom the last.
The methods of calculus work best with the least squares error method.
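Set $E(f) = \sum_{i=1}^{n} (f(x_i) - y_i)^2$, the sum running over the data points $(x_i, y_i)$.
The equation $y = f(x)$ is a regression equation for the data if $E$ is minimized by $f$ among the functions of a specified class (e.g., linear functions).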
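As a minimal sketch of the least squares error just defined, the following Python function sums the squared vertical deviations of a candidate function from the data. The name `least_squares_error`, the sample data, and the two candidate lines are assumptions chosen only for illustration; they do not come from the original text.

```python
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float]


def least_squares_error(f: Callable[[float], float], data: Sequence[Point]) -> float:
    """Return the sum of squared deviations (f(x) - y)^2 over the data."""
    return sum((f(x) - y) ** 2 for x, y in data)


# Hypothetical data points, used only to illustrate the definition.
data = [(0.0, 1.0), (1.0, 2.0), (2.0, 4.0)]

# Two candidate linear functions; the one with the smaller error fits better.
error_a = least_squares_error(lambda x: x + 1.0, data)
error_b = least_squares_error(lambda x: 1.5 * x + 5.0 / 6.0, data)
```

Comparing `error_a` and `error_b` shows how the error ranks candidate functions within a class: the regression equation is the candidate for which this number is smallest.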
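Find the linear function having least squares error for the data $(x_1, y_1), \dots, (x_n, y_n)$.
Write $f(x) = mx + b$. We must solve for $m$ and $b$. We compute
$$f(x_i) - y_i = m x_i + b - y_i.$$
So we must minimize $E(m, b) = \sum_{i=1}^{n} (m x_i + b - y_i)^2$.
Differentiating,
$$\frac{\partial E}{\partial m} = 2 \sum_{i=1}^{n} x_i (m x_i + b - y_i) \tag{1}$$
$$\frac{\partial E}{\partial b} = 2 \sum_{i=1}^{n} (m x_i + b - y_i) \tag{2}$$
Setting these equal to zero and dividing by 2 gives $m \sum_{i=1}^{n} x_i^2 + b \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} x_i y_i$ and $m \sum_{i=1}^{n} x_i + n b = \sum_{i=1}^{n} y_i$. Multiplying the first equation by $n$ and the second equation by $\sum_{i=1}^{n} x_i$, we have
$$m n \sum_{i=1}^{n} x_i^2 + b n \sum_{i=1}^{n} x_i = n \sum_{i=1}^{n} x_i y_i,$$
$$m \Big( \sum_{i=1}^{n} x_i \Big)^2 + b n \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i.$$
Subtracting, we conclude $m \big( n \sum_{i=1}^{n} x_i^2 - ( \sum_{i=1}^{n} x_i )^2 \big) = n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i$, or that
$$m = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n \sum_{i=1}^{n} x_i^2 - \big( \sum_{i=1}^{n} x_i \big)^2},$$
provided the $x_i$ are not all equal, so that the denominator is nonzero. Substituting into the equation $m \sum_{i=1}^{n} x_i + n b = \sum_{i=1}^{n} y_i$ we find $b = \frac{1}{n} \big( \sum_{i=1}^{n} y_i - m \sum_{i=1}^{n} x_i \big)$, so that
$$f(x) = m x + \frac{1}{n} \Big( \sum_{i=1}^{n} y_i - m \sum_{i=1}^{n} x_i \Big).$$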
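The elimination step can be sketched in Python for a line $y = mx + b$: set both first partial derivatives of the squared error to zero and solve the resulting pair of linear equations in closed form. The function name `fit_line` and the sample points are hypothetical, and the closed-form expressions in the comments are the standard ones for a least squares line, stated here as an assumption about the intended derivation.

```python
from typing import Sequence, Tuple


def fit_line(data: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Return (m, b) minimizing sum((m*x + b - y)^2) over the data."""
    n = len(data)
    sx = sum(x for x, _ in data)        # sum of x_i
    sy = sum(y for _, y in data)        # sum of y_i
    sxx = sum(x * x for x, _ in data)   # sum of x_i^2
    sxy = sum(x * y for x, y in data)   # sum of x_i * y_i

    # Denominator n*sum(x^2) - (sum x)^2 vanishes when all x_i coincide.
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("the x-values must not all be equal")

    m = (n * sxy - sx * sy) / denom     # slope from eliminating b
    b = (sy - m * sx) / n               # intercept by back-substitution
    return m, b


# Hypothetical points lying exactly on y = 2x + 1.
points = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
m, b = fit_line(points)
```

Because the sample points lie exactly on a line, the fit recovers that line's slope and intercept, which makes the sketch easy to check by hand.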
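We should test that this point is a minimum. The second partial derivatives of $E(m, b) = \sum_{i=1}^{n} (m x_i + b - y_i)^2$ are
$$\frac{\partial^2 E}{\partial m^2} = 2 \sum_{i=1}^{n} x_i^2 \tag{3}$$
$$\frac{\partial^2 E}{\partial m \, \partial b} = 2 \sum_{i=1}^{n} x_i \tag{4}$$
$$\frac{\partial^2 E}{\partial b^2} = 2n \tag{5}$$
$$D = \frac{\partial^2 E}{\partial m^2} \frac{\partial^2 E}{\partial b^2} - \Big( \frac{\partial^2 E}{\partial m \, \partial b} \Big)^2 = 4 \Big( n \sum_{i=1}^{n} x_i^2 - \Big( \sum_{i=1}^{n} x_i \Big)^2 \Big) \tag{6}$$
By the Cauchy-Schwarz inequality, $D > 0$ whenever the $x_i$ are not all equal, and $\frac{\partial^2 E}{\partial m^2} > 0$. Therefore, we have minimized $E$.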
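The second derivative test for a least squares line $y = mx + b$ can be checked numerically. For the squared error $\sum (m x_i + b - y_i)^2$, the second partial derivatives depend only on the $x$-values, so the sketch below takes just those. The function name `second_derivative_test` and the sample $x$-values are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def second_derivative_test(xs: Sequence[float]) -> Tuple[float, float, float, float]:
    """Return (E_mm, E_mb, E_bb, D) for E(m, b) = sum((m*x + b - y)^2)."""
    n = len(xs)
    e_mm = 2 * sum(x * x for x in xs)   # second partial in m
    e_mb = 2 * sum(xs)                  # mixed partial
    e_bb = 2 * n                        # second partial in b
    d = e_mm * e_bb - e_mb ** 2         # discriminant of the Hessian
    return e_mm, e_mb, e_bb, d


# Hypothetical x-values; any set that is not all equal gives D > 0.
e_mm, e_mb, e_bb, d = second_derivative_test([0.0, 1.0, 2.0])
is_minimum = d > 0 and e_mm > 0
```

A positive discriminant together with a positive second partial in $m$ is the usual two-variable criterion for a local minimum; since the error is a quadratic in $m$ and $b$, that minimum is also global.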