Examples K-L Distance
Code lengths for {A, C, G, T}
Jumping ahead a little,
we would get these average code lengths,
-SUMi{ pi.log2(qi) } bits per base,
for {A,C,G,T}
for the following true (p) and
assumed (q) probabilities of the bases:
{A, C, G, T}                   | assumed q = (1/4, 1/4, 1/4, 1/4) | assumed q' = (1/2, 1/4, 1/8, 1/8)
-------------------------------+----------------------------------+----------------------------------
true p  = (1/4, 1/4, 1/4, 1/4) | 2 = 1/4*2+1/4*2+1/4*2+1/4*2      | 2 1/4 = 1/4*1+1/4*2+1/4*3+1/4*3
true p' = (1/2, 1/4, 1/8, 1/8) | 2 = 1/2*2+1/4*2+1/8*2+1/8*2      | 1 3/4 = 1/2*1+1/4*2+1/8*3+1/8*3
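The four average code lengths can be checked numerically. The following is a minimal sketch, not part of the original page; the function name `average_code_length` and the variable names `fair` and `biased` are chosen here for illustration. It assumes an ideal code in which a base of assumed probability qi gets a code word of length -log2(qi) bits.

```python
from math import log2

def average_code_length(p, q):
    """Average bits per base when bases occur with true probabilities p
    but are encoded with a code designed for assumed probabilities q."""
    return -sum(pi * log2(qi) for pi, qi in zip(p, q) if pi > 0)

# Probabilities of A, C, G, T (illustrative names, values from the example).
fair = [1/4, 1/4, 1/4, 1/4]
biased = [1/2, 1/4, 1/8, 1/8]

for true_name, p in (("fair", fair), ("biased", biased)):
    for assumed_name, q in (("fair", fair), ("biased", biased)):
        bits = average_code_length(p, q)
        print(f"true {true_name}, assumed {assumed_name}: {bits} bits")
```

When the assumed probabilities equal the true ones, the result is the entropy of the true distribution: 2 bits for the fair case and 1 3/4 bits for the biased case.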
K-L Distance
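The K-L distance from a true distribution to an assumed one is the extra
average code length paid for using the wrong code: the average code length
under the assumed probabilities minus that under the true probabilities.
In the above example the K-L distances between the two distributions,
KL(fair->biased) = 2 1/4 - 2 = 1/4 and KL(biased->fair) = 2 - 1 3/4 = 1/4,
happen to be equal.
In general, and in the following example,
the K-L distance is not symmetric: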
{A, C, G, T}                   | assumed q = (1/2, 1/4, 1/8, 1/8) | assumed q' = (1/4, 1/8, 1/8, 1/2)
-------------------------------+----------------------------------+----------------------------------
true p  = (1/2, 1/4, 1/8, 1/8) | 1 3/4 = 1/2*1+1/4*2+1/8*3+1/8*3  | 2 1/4 = 1/2*2+1/4*3+1/8*3+1/8*1
true p' = (1/4, 1/8, 1/8, 1/2) | 2 3/8 = 1/4*1+1/8*2+1/8*3+1/2*3  | 1 3/4 = 1/4*2+1/8*3+1/8*3+1/2*1
KL(p->p') = 2 1/4 - 1 3/4 = 1/2, but KL(p'->p) = 2 3/8 - 1 3/4 = 5/8.
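The asymmetry can also be confirmed directly. This is a small sketch under the usual definition KL(p->q) = SUMi{ pi.log2(pi/qi) }, with p the true distribution; the function name `kl_distance` and the variable names are illustrative and do not come from the original page.

```python
from math import log2

def kl_distance(p, q):
    """K-L distance in bits from true distribution p to assumed distribution q."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Probabilities of A, C, G, T from the two examples (illustrative names).
fair = [1/4, 1/4, 1/4, 1/4]
p = [1/2, 1/4, 1/8, 1/8]
p_prime = [1/4, 1/8, 1/8, 1/2]

print(kl_distance(fair, p), kl_distance(p, fair))        # 0.25 0.25
print(kl_distance(p, p_prime), kl_distance(p_prime, p))  # 0.5 0.625
```

The first line reproduces the symmetric case of the first example; the second shows the two directions differing, 1/2 against 5/8.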
© L. Allison 2000.