Kullback-Leibler Distance (KL)
- The Kullback Leibler distance (KL-distance, KL-divergence) is a natural distance function from a "true" probability distribution, p, to a "target" probability distribution, q. It can be interpreted as the expected extra message-length per datum due to using a code based on the wrong (target) distribution compared to using a code based on the true distribution.
- For discrete (not necessarily finite) probability distributions, p={p1, ..., pn} and q={q1, ..., qn}, the KL-distance is defined to be
-
- KL(p, q) = Σi pi . log2( pi / qi )
- For continuous probability densities, the sum is replaced by an integral.
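As a concrete illustration of the discrete sum above, a minimal sketch follows, assuming the distributions are given as aligned lists of probabilities; the helper name `kl_divergence` is illustrative, and the usual conventions are applied (terms with pi = 0 contribute nothing; pi > 0 where qi = 0 gives an infinite distance):

```python
import math

def kl_divergence(p, q):
    """KL-distance in bits (base-2 log) from distribution p to distribution q."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # convention: a term with p_i = 0 contributes 0
        if qi == 0:
            return float("inf")  # p puts mass where q has none
        total += pi * math.log2(pi / qi)
    return total

p = [0.9, 0.1]
q = [0.5, 0.5]
print(kl_divergence(p, q))  # ≈ 0.531 extra bits per datum using q's code for p
```

Because the logs are base 2, the result is directly the expected extra message-length in bits per datum.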
- Note that
-
- KL(p, p) = 0
- KL(p, q) ≥ 0
- and that the KL-distance is not, in general, symmetric.
- However, a symmetric distance can be constructed, e.g.,
- KL(p, q) + KL(q, p)
- (sometimes divided by two).
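The symmetrized variant can be sketched the same way; the names `kl` and `symmetric_kl` below are illustrative, and this sketch assumes both distributions give positive probability to every element (so no infinite terms arise):

```python
import math

def kl(p, q):
    # KL-distance in bits; terms with p_i = 0 contribute 0 by convention.
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def symmetric_kl(p, q):
    # KL(p, q) + KL(q, p), divided by two, so the result is symmetric in p and q.
    return 0.5 * (kl(p, q) + kl(q, p))

p, q = [0.9, 0.1], [0.5, 0.5]
print(symmetric_kl(p, q) == symmetric_kl(q, p))  # True: order no longer matters
```

Note that while this sum is symmetric, it still does not satisfy the triangle inequality, so it is not a metric in the strict sense.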