Problem: Given a collection of data points $(x_1, y_1), \dots, (x_n, y_n)$,
find a function $y = f(x)$
which best fits these data.
When performing linear regression one assumes that $f$ is
a linear function. This assumption satisfies the first two desiderata, but
seldom the last.
The methods of calculus work best with the least squares error method.
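Set
$$E(f) = \sum_{i=1}^{n} \bigl(f(x_i) - y_i\bigr)^2,$$
the sum of the squared errors over the data points $(x_i, y_i)$.
The equation $y = f(x)$ is a regression equation for the
data if
$E$ is minimized by $f$
among the functions
of a specified class (i.e., linear functions).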
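The error being minimized can be sketched in a few lines of Python. This is only an illustration: the function name `squared_error`, the two candidate lines, and the sample data are assumptions made here, not part of the original notes.

```python
from typing import Callable, Sequence


def squared_error(f: Callable[[float], float],
                  xs: Sequence[float],
                  ys: Sequence[float]) -> float:
    """Sum of squared differences between f(x_i) and y_i."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    return sum((f(x) - y) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, chosen only for illustration.
xs = [0.0, 1.0, 2.0]
ys = [1.0, 3.0, 4.0]

# Two candidate linear functions; the one with the smaller error fits better.
error_a = squared_error(lambda x: 1.5 * x + 1.0, xs, ys)
error_b = squared_error(lambda x: 2.0 * x, xs, ys)
```

Comparing `error_a` with `error_b` ranks the two candidates; the regression problem asks for the candidate whose error is smallest over the whole class.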
Find the linear function having least squares error for the data
$(x_1, y_1), \dots, (x_n, y_n)$.
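Write
$f(x) = mx + b$. We must solve for the slope $m$
and the intercept
$b$. We compute
$$E(f) = \sum_{i=1}^{n} (m x_i + b - y_i)^2,$$
where all sums run over the $n$ data points $(x_i, y_i)$.
So we must minimize
$$E(m, b) = m^2 \sum x_i^2 + 2mb \sum x_i + n b^2 - 2m \sum x_i y_i - 2b \sum y_i + \sum y_i^2.$$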
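Differentiating $E(m, b) = \sum (m x_i + b - y_i)^2$ with respect to $m$ and $b$,
$$\frac{\partial E}{\partial m} = 2m \sum x_i^2 + 2b \sum x_i - 2 \sum x_i y_i \tag{1}$$
$$\frac{\partial E}{\partial b} = 2m \sum x_i + 2nb - 2 \sum y_i \tag{2}$$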
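As a sanity check on the differentiation step, the partial derivatives of the squared error for a line $y = mx + b$ can be compared against central finite differences. The sketch below assumes the symbols $m$ (slope) and $b$ (intercept); the function names, the evaluation point, and the data are hypothetical.

```python
from typing import Sequence, Tuple


def error(m: float, b: float, xs: Sequence[float], ys: Sequence[float]) -> float:
    """Squared error of the line y = m*x + b on the data."""
    return sum((m * x + b - y) ** 2 for x, y in zip(xs, ys))


def gradient(m: float, b: float,
             xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Partial derivatives of the squared error with respect to m and b."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    d_dm = 2 * m * sxx + 2 * b * sx - 2 * sxy
    d_db = 2 * m * sx + 2 * n * b - 2 * sy
    return d_dm, d_db


# Hypothetical data and evaluation point, chosen only for illustration.
xs = [0.0, 1.0, 2.0]
ys = [1.0, 3.0, 4.0]
m0, b0 = 1.0, 0.5
h = 1e-6

exact_dm, exact_db = gradient(m0, b0, xs, ys)
numeric_dm = (error(m0 + h, b0, xs, ys) - error(m0 - h, b0, xs, ys)) / (2 * h)
numeric_db = (error(m0, b0 + h, xs, ys) - error(m0, b0 - h, xs, ys)) / (2 * h)
```

Because the error is quadratic in $m$ and $b$, the central differences should agree with the exact derivatives up to floating-point rounding.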
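Setting these equal to zero and dividing by 2, then multiplying the first equation by $n$ and
the second equation by
$\sum x_i$, we have
$$nm \sum x_i^2 + nb \sum x_i = n \sum x_i y_i, \qquad m \Bigl(\sum x_i\Bigr)^2 + nb \sum x_i = \sum x_i \sum y_i.$$
Subtracting, we conclude
$$m \Bigl( n \sum x_i^2 - \bigl(\sum x_i\bigr)^2 \Bigr) = n \sum x_i y_i - \sum x_i \sum y_i$$
or that
$$m = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \bigl(\sum x_i\bigr)^2},$$
provided the $x_i$ are not all equal. Substituting into the equation
$m \sum x_i + nb = \sum y_i$
we find
$nb = \sum y_i - m \sum x_i$, so
that
$$b = \frac{\sum y_i - m \sum x_i}{n}.$$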
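The elimination above amounts to solving two linear equations for the slope and intercept of the line $y = mx + b$. A minimal sketch follows, assuming the standard closed-form solution of those normal equations; the name `fit_line` and the sample data are hypothetical.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Return (m, b) minimizing the sum of (m*x_i + b - y_i)**2."""
    n = len(xs)
    if n != len(ys):
        raise ValueError("xs and ys must have the same length")
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # The denominator vanishes exactly when all x values coincide
    # (or when there is no data), so no unique line exists.
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("need at least two distinct x values")
    m = (n * sxy - sx * sy) / denom
    b = (sy - m * sx) / n
    return m, b


# Hypothetical data, chosen only for illustration.
xs = [0.0, 1.0, 2.0]
ys = [1.0, 3.0, 4.0]
m, b = fit_line(xs, ys)
```

The slope is computed first and then substituted into the second equation to obtain the intercept, mirroring the order of the hand calculation.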
We should test that this point is a minimum.
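$$\frac{\partial^2 E}{\partial m^2} = 2 \sum x_i^2 \tag{3}$$
$$\frac{\partial^2 E}{\partial m \, \partial b} = 2 \sum x_i \tag{4}$$
$$\frac{\partial^2 E}{\partial b^2} = 2n \tag{5}$$
$$D = \frac{\partial^2 E}{\partial m^2} \frac{\partial^2 E}{\partial b^2} - \Bigl(\frac{\partial^2 E}{\partial m \, \partial b}\Bigr)^2 = 4 \Bigl( n \sum x_i^2 - \bigl(\sum x_i\bigr)^2 \Bigr) \tag{6}$$
When the $x_i$ are not all equal, the Cauchy-Schwarz inequality gives $n \sum x_i^2 > \bigl(\sum x_i\bigr)^2$, so $D > 0$
and
$\partial^2 E / \partial m^2 > 0$. Therefore,
we have minimized
$E$.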
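The second derivative test for the squared error of a line $y = mx + b$ depends only on the $x$ values of the data, so it can be checked directly. The sketch below assumes the usual two-variable test (discriminant positive and pure second derivative in $m$ positive); the function name and the data are hypothetical.

```python
from typing import Sequence, Tuple


def second_derivative_test(xs: Sequence[float]) -> Tuple[float, float]:
    """Return (E_mm, D) for the squared error of a line y = m*x + b."""
    n = len(xs)
    e_mm = 2 * sum(x * x for x in xs)  # second partial in m
    e_mb = 2 * sum(xs)                 # mixed partial
    e_bb = 2 * n                       # second partial in b
    d = e_mm * e_bb - e_mb ** 2        # discriminant of the test
    return e_mm, d


# Hypothetical x values, chosen only for illustration.
xs = [0.0, 1.0, 2.0]
e_mm, d = second_derivative_test(xs)
is_minimum = d > 0 and e_mm > 0

# Degenerate case: all x values equal, so the test is inconclusive (D = 0).
_, d_degenerate = second_derivative_test([2.0, 2.0])
```

In the degenerate case every line through the common $x$ value with the right height gives the same error, which is why no unique minimizer exists there.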