Geometric Distribution
- Often, P(x|p) = (1-p)^(x-1) . p, integer x≥1, μ=1/p, μ≥1, but
  here, P(x|p) = (1-p)^x . p, integer x≥0, μ=(1/p)-1, μ≥0, p=1/(μ+1), 1-p=μ/(μ+1).
- In μ-space:
- p = 1/(μ+1), so
- P(x|μ) = (1 - 1/(μ+1))^x / (μ+1)
- = (μ / (μ+1))^x / (μ+1)
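- A quick numerical check of this parameterisation, as a minimal Python sketch (the name geometric_pmf is illustrative, not from [IP 1.2]): the probabilities should sum to 1 and the mean should be μ.

```python
# Sketch: the geometric distribution in mu-space, P(x|mu) = (mu/(mu+1))^x / (mu+1),
# for integer x >= 0.

def geometric_pmf(x: int, mu: float) -> float:
    """P(x | mu) = (mu/(mu+1))^x / (mu+1)."""
    return (mu / (mu + 1.0)) ** x / (mu + 1.0)

mu = 2.5
print(sum(geometric_pmf(x, mu) for x in range(1000)))      # ~1.0
print(sum(x * geometric_pmf(x, mu) for x in range(1000)))  # ~mu = 2.5
```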
- Given n data, x1, ..., xn, the likelihood
- = P(x1, ..., xn | μ)
  = (μ / (μ+1))^(∑xi) / (μ+1)^n
- neg log likelihood
- L = (∑xi).(log(μ+1) - log μ) + n.log(μ+1)
- 1st derivative
- d L / d μ = (∑xi).(1/(μ+1) - 1/μ) + n/(μ+1)
- If we equate this to zero and multiply by μ(μ+1),
  (∑xi).μ - (∑xi).(μ+1) + n.μ = 0,
  μ_maxLH = (∑xi) / n.
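- The closed form can be checked numerically, e.g. (an illustrative sketch; the sampling loop and grid search are only for demonstration):

```python
import math
import random

def neg_log_likelihood(xs, mu):
    """L = (sum xi).(log(mu+1) - log(mu)) + n.log(mu+1)."""
    n, s = len(xs), sum(xs)
    return s * (math.log(mu + 1.0) - math.log(mu)) + n * math.log(mu + 1.0)

random.seed(0)
mu_true = 3.0
p = 1.0 / (mu_true + 1.0)
xs = []
for _ in range(1000):               # count failures before the first success
    x = 0
    while random.random() >= p:
        x += 1
    xs.append(x)

mu_ml = sum(xs) / len(xs)           # closed form: (sum xi) / n
grid = [0.01 * k for k in range(1, 1000)]
mu_grid = min(grid, key=lambda m: neg_log_likelihood(xs, m))
print(mu_ml, mu_grid)               # both close to mu_true = 3.0
```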
- 2nd derivative
- d^2 L / d μ^2 = (∑xi).(1/μ^2 - 1/(μ+1)^2) - n/(μ+1)^2
- Note that E ∑xi = n.μ.
- The 2nd derivative has expectation, i.e., Fisher information, F_μ
- = n.μ.(1/μ^2 - 1/(μ+1)^2) - n/(μ+1)^2
- = n/μ - n.μ/(μ+1)^2 - n/(μ+1)^2
- = n.(1/μ - 1/(μ+1))
- = n / (μ (μ+1))
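- The closed form for F_μ can be checked against a numerical second derivative of the expected L (a sketch; function names are illustrative):

```python
import math

def fisher_info(n, mu):
    """F_mu = n / (mu.(mu+1))."""
    return n / (mu * (mu + 1.0))

def expected_nll(n, mu_true, mu):
    """E L, obtained by replacing sum(xi) in L with its expectation n.mu_true."""
    return (n * mu_true * (math.log(mu + 1.0) - math.log(mu))
            + n * math.log(mu + 1.0))

n, mu, h = 50, 2.0, 1e-4
# Central second difference of E L, evaluated at the true mu.
second_diff = (expected_nll(n, mu, mu + h) - 2.0 * expected_nll(n, mu, mu)
               + expected_nll(n, mu, mu - h)) / h**2
print(second_diff, fisher_info(n, mu))  # both ~ 50/6 = 8.33...
```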
- Assume prior, h μ = (1/A).e^(-μ/A), which has mean A.
- The two-part message length, m
- = -log(h μ) + L + (1/2)log F_μ + (-log 12 + 1)/2
- = log A + μ/A - (∑xi).log(μ/(μ+1)) + n.log(μ+1)
  + (1/2)log n - (1/2)log μ - (1/2)log(μ+1) + (-log 12 + 1)/2
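- The terms above translate directly into code; a minimal sketch (natural logarithms, so m is in nits; message_length is an illustrative name, not the [IP 1.2] implementation):

```python
import math

def message_length(xs, mu, A):
    """Two-part message length (in nits):
    m = -log(h mu) + L + (1/2)log(F_mu) + (-log 12 + 1)/2,
    with prior h mu = (1/A).e^(-mu/A)."""
    n, s = len(xs), sum(xs)
    neg_log_prior = math.log(A) + mu / A
    nll = s * (math.log(mu + 1.0) - math.log(mu)) + n * math.log(mu + 1.0)
    half_log_fisher = 0.5 * (math.log(n) - math.log(mu) - math.log(mu + 1.0))
    return neg_log_prior + nll + half_log_fisher + (1.0 - math.log(12.0)) / 2.0

print(message_length([0, 2, 1, 5, 3], mu=2.2, A=10.0))
```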
- To estimate μ, differentiate m with respect to μ
- d m / d μ
- = 1/A + (∑xi).{1/(μ+1) - 1/μ} + n/(μ+1) - 1/(2μ) - 1/(2(μ+1))
- = 1/A + (1/(μ+1)).{∑xi + n - 1/2} - (1/μ).{∑xi + 1/2}
- equate to zero, multiply by μ(μ+1)
- 0 = μ(μ+1)/A + μ{∑xi + n - 1/2} - (μ+1){∑xi + 1/2}
- = μ^2/A + μ{1/A + n - 1} - 1/2 - ∑xi
- (Note that if A is "very large", μ_MML = (∑xi + 1/2) / (n - 1).)
- The quadratic has solutions
- μ_MML = (1 - n - 1/A ± √{n^2 + 1/A^2 + 1 + 2n/A - 2/A - 2n + 2/A + 4(∑xi)/A}) / (2/A)
- = (1 - n - 1/A ± √{n^2 + 1/A^2 + 1 + 2n/A - 2n + 4(∑xi)/A}) / (2/A)
- only the "+" solution is admissible.
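- A sketch of the estimator (mu_mml is an illustrative name); for "very large" A it approaches (∑xi + 1/2)/(n - 1), as noted above:

```python
import math

def mu_mml(xs, A):
    """Admissible ('+') root of mu^2/A + mu.(1/A + n - 1) - (sum xi + 1/2) = 0."""
    n, s = len(xs), sum(xs)
    b = 1.0 / A + n - 1.0                       # a = 1/A, c = -(s + 1/2)
    disc = b * b + (4.0 / A) * (s + 0.5)        # b^2 - 4ac
    return (-b + math.sqrt(disc)) / (2.0 / A)

xs = [0, 2, 1, 5, 3]
n, s = len(xs), sum(xs)
print(mu_mml(xs, A=10.0))
print(mu_mml(xs, A=1e9))        # ~ (s + 1/2)/(n - 1) = 2.875 for "very large" A
print(s / n)                    # maximum-likelihood estimate, for comparison
```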
(Also see Poisson.)
-- L.A., July 2007.
Thanks to Daniel Schmidt and Enes Makalic.
See [IP 1.2] for an implementation.