- The Kullback-Leibler distance (KL-distance, KL-divergence)
is a natural distance function
from a "true" probability distribution, p,
to a "target" probability distribution, q.
It can be interpreted as the expected extra message-length per datum
due to using a code based on the wrong (target) distribution compared to
using a code based on the true distribution.
-
- For discrete (not necessarily finite) probability distributions,
p = {p_1, ..., p_n} and
q = {q_1, ..., q_n},
the KL-distance is defined to be
-
- KL(p, q) = Σ_i p_i · log2(p_i / q_i)
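-
- As a minimal sketch of the discrete definition, the sum can be computed directly. The function name `kl_divergence` and the example distributions are illustrative assumptions, not part of the original text. Terms with p_i = 0 are taken to contribute 0, which is the usual convention, and the result is infinite if some q_i = 0 where p_i > 0.

```python
import math


def kl_divergence(p, q):
    """Return KL(p, q) in bits for two discrete distributions.

    p and q are equal-length sequences of probabilities (hypothetical inputs).
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of outcomes")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0:
            continue  # convention: 0 * log(0 / q_i) contributes 0
        if q_i == 0:
            return math.inf  # q gives zero probability to a possible outcome
        total += p_i * math.log2(p_i / q_i)
    return total


p = [0.5, 0.25, 0.25]   # "true" distribution (example values)
q = [0.25, 0.25, 0.5]   # "target" distribution (example values)
print(kl_divergence(p, q))  # 0.5*1 + 0.25*0 + 0.25*(-1) = 0.25 bits
```

- Here a code built for q costs, on average, a quarter of a bit more per datum than a code built for p.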
-
- For continuous probability densities,
the sum is replaced by an integral.
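-
- To illustrate the continuous case, the sketch below approximates the integral numerically for two normal densities and compares it with the known closed form for normals. The function names, the integration range and the step count are assumptions chosen for this example; the range must cover essentially all of the mass of p.

```python
import math


def log_normal_pdf(x, mu, sigma):
    """Natural log of the N(mu, sigma^2) density at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def kl_normal_numeric(mu_p, sigma_p, mu_q, sigma_q, lo=-12.0, hi=12.0, steps=24000):
    """Midpoint-rule approximation of the integral of p(x) log2(p(x)/q(x)) dx."""
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        log_p = log_normal_pdf(x, mu_p, sigma_p)
        log_q = log_normal_pdf(x, mu_q, sigma_q)
        total += math.exp(log_p) * (log_p - log_q) * dx
    return total / math.log(2)  # convert nats to bits


def kl_normal_closed_form(mu_p, sigma_p, mu_q, sigma_q):
    """Closed-form KL between two normal densities, in bits."""
    nats = (math.log(sigma_q / sigma_p)
            + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2 * sigma_q ** 2)
            - 0.5)
    return nats / math.log(2)


# Example: p = N(0, 1), q = N(1, 4); the two values should agree closely.
print(kl_normal_numeric(0.0, 1.0, 1.0, 2.0))
print(kl_normal_closed_form(0.0, 1.0, 1.0, 2.0))
```

- Working with log densities avoids dividing two very small numbers in the tails.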
-
- Note that
-
- KL(p, p) = 0
- KL(p, q) ≥ 0, with equality if and only if p = q
-
- and that the KL-distance is not, in general, symmetric.
- However, a symmetric distance can be made, e.g.,
- KL(p, q) + KL(q, p)
- (sometimes divided by two).
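-
- The asymmetry, and the symmetrised sum, can be checked on a small example. This is a sketch only: `kl_divergence` and `symmetric_kl` are hypothetical helper names, the distributions are made-up values, and all probabilities are assumed to be strictly positive.

```python
import math


def kl_divergence(p, q):
    """KL(p, q) in bits; assumes every p_i and q_i is strictly positive."""
    return sum(p_i * math.log2(p_i / q_i) for p_i, q_i in zip(p, q))


def symmetric_kl(p, q, halve=False):
    """KL(p, q) + KL(q, p), optionally divided by two."""
    total = kl_divergence(p, q) + kl_divergence(q, p)
    return total / 2 if halve else total


p = [0.5, 0.5]     # example values
q = [0.25, 0.75]   # example values

print(kl_divergence(p, q))   # about 0.2075 bits
print(kl_divergence(q, p))   # about 0.1887 bits, so the order matters
print(symmetric_kl(p, q))    # same value as symmetric_kl(q, p)
```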