KL-distance from N(μ1,σ1) to N(μ2,σ2)
(Also known as KL-divergence.)
- The general form is
-
- ∫x pdf1(x).{ log(pdf1(x)) - log(pdf2(x)) }
-
- We have two normals, so pdf1(x) is N(μ1,σ1)(x), etc.
-
- = ∫x N(μ1,σ1)(x).{ log(N(μ1,σ1)(x)) - log(N(μ2,σ2)(x)) }
-
- = ∫x N(μ1,σ1)(x).{ (1/2)( -((x-μ1)/σ1)² + ((x-μ2)/σ2)² ) + ln(σ2/σ1) }
-
- We can substitute x → x+μ1, which centres the first normal on zero.
The expected value of x² under it is then σ1².
Terms that are odd in x, and otherwise symmetric about zero,
cancel out over [-∞,∞], leaving only the ...x² and ...constant terms.
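The key fact used in this step can be checked numerically. The sketch below (not part of the original note; the parameter values are arbitrary examples) verifies by trapezoidal quadrature that the expectation of ((x-μ2)/σ2)² under N(μ1,σ1) is (σ1² + (μ1-μ2)²)/σ2², which is what survives after the odd terms cancel.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def expect(f, mu, sigma, n=200001, width=10.0):
    """Trapezoidal estimate of E[f(x)] under N(mu, sigma)."""
    lo, hi = mu - width * sigma, mu + width * sigma
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * h
        w = 0.5 if i in (0, n - 1) else 1.0
        total += w * normal_pdf(x, mu, sigma) * f(x)
    return total * h

mu1, s1, mu2, s2 = 0.3, 1.5, -0.7, 0.8  # arbitrary example values
lhs = expect(lambda x: ((x - mu2) / s2) ** 2, mu1, s1)
rhs = (s1 ** 2 + (mu1 - mu2) ** 2) / s2 ** 2
print(lhs, rhs)  # the two values agree to high precision
```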
-
- = (1/2){ -(σ1/σ1)² + (σ1/σ2)² + ((μ1-μ2)/σ2)² } + ln(σ2/σ1)
-
- = { (μ1-μ2)² + σ1² - σ2² } / (2.σ2²) + ln(σ2/σ1)
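The closed form can be checked against a direct numerical integration of the defining integral ∫x pdf1(x).{log(pdf1(x)) - log(pdf2(x))}. A minimal Python sketch (not from the source; the test values are arbitrary):

```python
import math

def kl_normal(mu1, s1, mu2, s2):
    """KL( N(mu1,s1) || N(mu2,s2) ), using the closed form derived above."""
    return ((mu1 - mu2) ** 2 + s1 ** 2 - s2 ** 2) / (2 * s2 ** 2) + math.log(s2 / s1)

def kl_numeric(mu1, s1, mu2, s2, n=200001, width=10.0):
    """Trapezoidal integral of pdf1(x).(log pdf1(x) - log pdf2(x))."""
    def log_pdf(x, mu, sigma):
        return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))
    lo, hi = mu1 - width * s1, mu1 + width * s1
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * h
        w = 0.5 if i in (0, n - 1) else 1.0
        p1 = math.exp(log_pdf(x, mu1, s1))
        total += w * p1 * (log_pdf(x, mu1, s1) - log_pdf(x, mu2, s2))
    return total * h

print(kl_normal(0, 1, 0, 1))           # 0.0 when the two normals coincide
print(kl_normal(0, 1, 1, 1))           # 0.5, i.e. (mu1-mu2)^2/(2.sigma^2) for common sigma
print(kl_normal(0.3, 1.5, -0.7, 0.8))  # agrees with kl_numeric(0.3, 1.5, -0.7, 0.8)
```

The second printed value illustrates the common-σ special case, (μ1-μ2)²/(2σ²).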
-
- This is zero if μ1=μ2 and σ1=σ2.
It obviously increases with |μ1-μ2| and has rather complex behaviour with σ1 and σ2
(and is consistent with P&R, and with J&S where σ1=σ2).
- KL(N(μq,σq) || N(μp,σp)), p.18 of Penny & Roberts, PARG-00-12, 2000.
- KL(N(μ1,σ), N(μ2,σ)) = (μ1-μ2)²/(2σ²), Johnson & Sinanovic, NB. a common σ [pdf].
- Note that the distance is convenient to integrate over, say, a range
of μ1 & σ1:
-
- ∫μ1min..μ1max ∫σ1min..σ1max {
      (1/2).((μ1-μ2)/σ2)² + ln σ2 - 1/2      NB. no σ1 here ...
    + (1/2).(σ1/σ2)² - ln σ1                 ... & no μ1
  }
-
- let f(μ1) = (μ1-μ2)³/(6.σ2²) + μ1.(ln σ2 - 1/2)
- and g(σ1) = σ1³/(6.σ2²) - σ1.(ln σ1 - 1)
-
- = (f(μ1max) - f(μ1min)) . (σ1max - σ1min)
  + (μ1max - μ1min) . (g(σ1max) - g(σ1min))
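The closed-form double integral can itself be sanity-checked against a 2-D midpoint-rule quadrature. A sketch, not from the source; the ranges and the fixed second normal are arbitrary example values:

```python
import math

def kl(mu1, s1, mu2, s2):
    """KL( N(mu1,s1) || N(mu2,s2) ), closed form as derived above."""
    return ((mu1 - mu2) ** 2 + s1 ** 2 - s2 ** 2) / (2 * s2 ** 2) + math.log(s2 / s1)

mu2, s2 = 0.5, 1.2      # fixed second normal (example values)
m_lo, m_hi = -1.0, 2.0  # range of mu1
s_lo, s_hi = 0.5, 3.0   # range of sigma1

def f(m):
    """Antiderivative in mu1 of the mu1-only part of the integrand."""
    return (m - mu2) ** 3 / (6 * s2 ** 2) + m * (math.log(s2) - 0.5)

def g(s):
    """Antiderivative in sigma1 of the sigma1-only part."""
    return s ** 3 / (6 * s2 ** 2) - s * (math.log(s) - 1)

closed = (f(m_hi) - f(m_lo)) * (s_hi - s_lo) + (m_hi - m_lo) * (g(s_hi) - g(s_lo))

# midpoint-rule estimate of the same double integral over the rectangle
n = 400
hm, hs = (m_hi - m_lo) / n, (s_hi - s_lo) / n
numeric = sum(
    kl(m_lo + (i + 0.5) * hm, s_lo + (j + 0.5) * hs, mu2, s2)
    for i in range(n) for j in range(n)
) * hm * hs

print(closed, numeric)  # the two estimates agree closely
```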