- The simplest linear regression as a case study:
  y = a.x + b + N(0, σ), or equivalently y ~ N(a.x + b, σ).

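- For illustration, a minimal Python sketch simulating data from this model
  (the values a = 1.5, b = 0.3, σ = 0.5 and n = 100 are arbitrary choices):

    import numpy as np

    rng = np.random.default_rng(0)
    a_true, b_true, sigma_true = 1.5, 0.3, 0.5   # arbitrary "true" parameters
    n = 100

    x = rng.uniform(-2.0, 2.0, size=n)           # the x_i are "common knowledge"
    y = a_true * x + b_true + rng.normal(0.0, sigma_true, size=n)
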
- The probability density of y given x:
  pr(y|x) = (1/(σ.sqrt(2 π))).exp( -(y - a.x - b)^2 / (2 σ^2) ).

- Given {(x_i, y_i)}, i = 1..n, where the x_i are "common knowledge",
  the negative log likelihood is

  L = (n/2).log(2 π) + (n/2).log(σ^2) + (1/(2 σ^2)).Σ{y_i - a.x_i - b}^2

    = (n/2).log(2 π) + n.log(σ) + (1/(2 σ^2)).Σ{y_i - a.x_i - b}^2

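- A direct transcription of L into Python (a sketch; the data are the same
  arbitrary simulation as above):

    import numpy as np

    def neg_log_likelihood(a, b, sigma, x, y):
        # L = (n/2).log(2 pi) + n.log(sigma) + sum(residual^2) / (2 sigma^2)
        n = len(x)
        resid = y - a * x - b
        return (n / 2) * np.log(2 * np.pi) + n * np.log(sigma) \
               + np.sum(resid ** 2) / (2 * sigma ** 2)

    rng = np.random.default_rng(0)
    x = rng.uniform(-2.0, 2.0, size=100)
    y = 1.5 * x + 0.3 + rng.normal(0.0, 0.5, size=100)
    print(neg_log_likelihood(1.5, 0.3, 0.5, x, y))
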
- First partial derivatives...

  dL/da = (-1/σ^2).Σ{ x_i.(y_i - a.x_i - b) }

        = (-1/σ^2).Σ{ x_i.y_i - a.x_i^2 - x_i.b }

  dL/db = (-1/σ^2).Σ{y_i - a.x_i - b}

        = (-1/σ^2).{ (Σ y_i) - a.(Σ x_i) - n.b }

  dL/dσ = n/σ - (1/σ^3).Σ{y_i - a.x_i - b}^2

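- These expressions can be checked against a central-difference gradient;
  a sketch (the test point and step size are arbitrary):

    import numpy as np

    def L(a, b, sigma, x, y):
        n = len(x)
        r = y - a * x - b
        return (n / 2) * np.log(2 * np.pi) + n * np.log(sigma) \
               + np.sum(r ** 2) / (2 * sigma ** 2)

    def grad(a, b, sigma, x, y):
        # dL/da, dL/db, dL/dsigma as derived above
        r = y - a * x - b
        return np.array([-np.sum(x * r) / sigma**2,
                         -np.sum(r) / sigma**2,
                         len(x) / sigma - np.sum(r**2) / sigma**3])

    rng = np.random.default_rng(0)
    x = rng.uniform(-2.0, 2.0, size=100)
    y = 1.5 * x + 0.3 + rng.normal(0.0, 0.5, size=100)

    theta, h = np.array([1.4, 0.2, 0.6]), 1e-6
    numeric = np.array([(L(*(theta + h * e), x, y) - L(*(theta - h * e), x, y)) / (2 * h)
                        for e in np.eye(3)])
    print(np.allclose(numeric, grad(*theta, x, y), rtol=1e-4))   # True
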
- L is minimized when the line passes through the C of G of the points
  (from dL/db = 0, which leaves the slope, a):
  a = Σ x_i.(y_i - b) / Σ x_i^2 (from dL/da = 0),
  and σ is the sqrt of the residual variance,
  σ^2 = (1/n).Σ{y_i - a.x_i - b}^2 (from dL/dσ = 0).

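- Putting the three stationarity conditions together gives the fit in closed
  form; a Python sketch, checked against numpy's least-squares line:

    import numpy as np

    def mle_fit(x, y):
        # Shift the C of G to the origin, solve for the slope, then sigma.
        xc, yc = x - x.mean(), y - y.mean()
        a = np.sum(xc * yc) / np.sum(xc ** 2)
        b = y.mean() - a * x.mean()                     # line through the C of G
        sigma = np.sqrt(np.mean((y - a * x - b) ** 2))  # root mean sq. residual
        return a, b, sigma

    rng = np.random.default_rng(0)
    x = rng.uniform(-2.0, 2.0, size=100)
    y = 1.5 * x + 0.3 + rng.normal(0.0, 0.5, size=100)

    print(mle_fit(x, y))
    print(np.polyfit(x, y, 1))   # same a and b from least squares
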
- Second partial derivatives...

  d^2 L / d a^2 = (+1/σ^2).Σ{ x_i^2 }
  (and remember, the x_i are common knowledge)

  d^2 L / d b^2 = n/σ^2

  d^2 L / d σ^2 = -n/σ^2 + (3/σ^4).Σ{y_i - a.x_i - b}^2;
  expectation = 2 n/σ^2 (since Ey Σ{y_i - a.x_i - b}^2 = n.σ^2)

- Off-diagonal second partial derivatives...

  d^2 L / (da db) = (+1/σ^2).Σ x_i = n.mean{x_i} / σ^2

  d^2 L / (da dσ) = (+2/σ^3).Σ{ x_i.(y_i - a.x_i - b) };
  expectation = 0

  d^2 L / (db dσ) = (+2/σ^3).Σ{y_i - a.x_i - b};
  expectation = 0

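- All six distinct second derivatives can be verified symbolically; a sketch
  using sympy on one point's contribution to L (the sums then follow term
  by term):

    import sympy as sp

    a, b, x, y = sp.symbols('a b x y', real=True)
    sigma = sp.symbols('sigma', positive=True)

    # One point's contribution to L.
    Li = sp.log(2 * sp.pi) / 2 + sp.log(sigma) + (y - a * x - b)**2 / (2 * sigma**2)

    for u, v in [(a, a), (b, b), (sigma, sigma), (a, b), (a, sigma), (b, sigma)]:
        print(u, v, sp.simplify(sp.diff(Li, u, v)))
    # e.g. (a, a) prints x**2/sigma**2; summed over points: Σ x_i^2 / σ^2
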
- Fisher information matrix:

       |       a        |       b        |       σ
   ----+----------------+----------------+----------------
    a  | Ey d^2L/da^2   | Ey d^2L/da db  |       0
    b  | Ey d^2L/da db  | Ey d^2L/db^2   |       0
    σ  |       0        |       0        | Ey d^2L/dσ^2

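- Assembling the expected Fisher matrix numerically (a sketch; the zeros are
  the two vanishing expectations above):

    import numpy as np

    def fisher(x, sigma):
        # Expected Fisher information for (a, b, sigma); x is common knowledge.
        n = len(x)
        s2 = sigma ** 2
        return np.array([[np.sum(x**2) / s2, np.sum(x) / s2, 0.0],
                         [np.sum(x) / s2,    n / s2,         0.0],
                         [0.0,               0.0,            2 * n / s2]])

    rng = np.random.default_rng(0)
    x = rng.uniform(-2.0, 2.0, size=100)
    print(fisher(x, 0.5))
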
- det F = 2 n.{ n.(Σ x_i^2) - (n.mean{x_i})^2 } / σ^6

        = 2 n^3.{ (Σ x_i^2)/n - (mean{x_i})^2 } / σ^6

        = 2 n^3.variance{x_i} / σ^6

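- A quick numerical check of the closed form against the determinant of the
  matrix above (a sketch; np.var is the biased, 1/n, variance, matching the
  algebra):

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.uniform(-2.0, 2.0, size=100)
    n, sigma = len(x), 0.5
    s2 = sigma ** 2

    F = np.array([[np.sum(x**2) / s2, np.sum(x) / s2, 0.0],
                  [np.sum(x) / s2,    n / s2,         0.0],
                  [0.0,               0.0,            2 * n / s2]])
    closed = 2 * n**3 * np.var(x) / sigma**6
    print(np.isclose(np.linalg.det(F), closed))   # True
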
Priors

- a = tan θ where θ is the angular slope.
- da/dθ = 1/cos^2 θ = 1 + a^2, so dθ/da = 1/(1 + a^2).
- The uniform prior, 1/π, on θ therefore corresponds to
  the prior pr(a) = 1/(π.(1 + a^2)) on 'a'.
- b can be untangled from 'a' by making the C of G the origin.
  Then b plays the role of μ (of the {y_i}) in the normal distribution.

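- The change of variable can be checked by sampling; a sketch comparing the
  empirical density of a = tan θ, with θ uniform on (-π/2, π/2), against
  1/(π.(1 + a^2)):

    import numpy as np

    rng = np.random.default_rng(0)
    theta = rng.uniform(-np.pi / 2, np.pi / 2, size=200_000)
    a = np.tan(theta)

    edges = np.linspace(-4.0, 4.0, 17)
    counts, _ = np.histogram(a, bins=edges)
    emp = counts / (len(a) * (edges[1] - edges[0]))     # empirical density
    centres = (edges[:-1] + edges[1:]) / 2
    cauchy = 1 / (np.pi * (1 + centres ** 2))
    print(np.max(np.abs(emp - cauchy)))   # close to 0; shrinks with more samples
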
-- L.A. @ Dept. Comp. Sci., U. York, 12/2004