von Mises - Fisher (vMF)
- The von Mises - Fisher (vMF) distribution is a probability distribution on directions in $\mathbb{R}^D$. It is natural to think of it as a distribution on the (D-1)-sphere of unit radius, that is, on the surface of the D-ball of unit radius.
- The von Mises - Fisher probability density function is
- $\mathrm{pdf}(v \mid \mu, \kappa) = C_D(\kappa)\, e^{\kappa\, \mu \cdot v}$
- where datum v is a normalised D-vector,
equivalently a point on the (D-1)-sphere,
- mu, μ, is the mean (a normalised vector), and
- kappa, κ ≥ 0, is the concentration parameter (a scalar).
- The distribution's normalising constant is
- $C_D(\kappa) = \dfrac{\kappa^{D/2-1}}{(2\pi)^{D/2}\, I_{D/2-1}(\kappa)}$
- where $I_{\nu}(\cdot)$ is the "modified Bessel function of the first kind" of order $\nu$.
- In the special case that D = 3,
- $C_3(\kappa) = \kappa / \{2\pi\, (e^{\kappa} - e^{-\kappa})\}$
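- The normalising constant can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names (`bessel_i`, `vmf_norm_const`, `vmf_norm_const_3d`) are hypothetical, and the Bessel function is summed from its power series, which is an assumption that suits moderate κ only (a library routine would be preferable for large κ).

```python
import math

def bessel_i(nu, z, tol=1e-16, max_terms=1000):
    """Modified Bessel function of the first kind, I_nu(z), summed from its
    power series; a sketch for moderate z >= 0 and nu >= 0."""
    half = 0.5 * z
    term = half ** nu / math.gamma(nu + 1.0)    # k = 0 term of the series
    total = term
    for k in range(1, max_terms):
        term *= half * half / (k * (k + nu))    # ratio of successive terms
        total += term
        if term <= tol * total:
            break
    return total

def vmf_norm_const(dim, kappa):
    """C_D(kappa) = kappa^(D/2-1) / ((2 pi)^(D/2) I_{D/2-1}(kappa)), kappa > 0."""
    order = dim / 2.0 - 1.0
    return kappa ** order / ((2.0 * math.pi) ** (dim / 2.0) * bessel_i(order, kappa))

def vmf_norm_const_3d(kappa):
    """Closed form of the constant for D = 3."""
    return kappa / (2.0 * math.pi * (math.exp(kappa) - math.exp(-kappa)))

# The general formula and the D = 3 closed form should agree.
kappa = 2.0
print(vmf_norm_const(3, kappa), vmf_norm_const_3d(kappa))
```

The two printed values should agree to within rounding error, which is a quick check that the general formula reduces to the D = 3 special case.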
- The negative log pdf is
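- $-\log \mathrm{pdf}(v \mid \mu, \kappa) = -\log C_D(\kappa) - \kappa\, \mu \cdot v$.
- Given $N$ data, $v_0, \ldots, v_{N-1}$, let
- $R = \sum_{i=0}^{N-1} v_i$,
- and
- $\mathrm{Rbar} = \|R\| / N$.
- The negative log likelihood is
- $-\mathrm{logLH} = -N \log C_D(\kappa) - \kappa\, \mu \cdot R$.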
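- Since μ enters $-\mathrm{logLH}$ only through the term $-\kappa\, \mu \cdot R$, the maximum likelihood estimate of μ is R normalised,
- $\mu_{ML} = R / \|R\|$,
- and the MML estimate is the same,
- $\mu_{MML} = \mu_{ML} = R / \|R\|$,
- the most general prior for μ being the uniform distribution.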
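- The sufficient statistics and the estimate of μ can be computed directly from the data. This is a minimal sketch in plain Python; the function names (`resultant`, `mu_ml`) and the toy data are hypothetical, and the data are assumed to be already normalised to unit length.

```python
import math

def resultant(data):
    """Return (R, Rbar) for a list of unit D-vectors: R is their vector sum
    and Rbar = ||R|| / N."""
    dim = len(data[0])
    r = [sum(v[j] for v in data) for j in range(dim)]
    r_norm = math.sqrt(sum(x * x for x in r))
    return r, r_norm / len(data)

def mu_ml(data):
    """Maximum likelihood estimate of mu: R normalised (undefined if R = 0)."""
    r, rbar = resultant(data)
    r_norm = rbar * len(data)                  # recover ||R|| from Rbar
    return [x / r_norm for x in r]

# Two orthogonal unit vectors in three dimensions (toy data).
data = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
r, rbar = resultant(data)
print(r, rbar, mu_ml(data))
```

For this toy data R = (1, 1, 0), so the estimated mean direction bisects the two data vectors.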
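- For given μ and κ, the expected value of $\mu \cdot v$, and hence of $\mu \cdot R / N$, equals
- $A_D(\kappa) = I_{D/2}(\kappa) / I_{D/2-1}(\kappa)$,
- and the (less obvious) maximum likelihood estimate of κ is
- $\kappa_{ML} = A_D^{-1}(\mathrm{Rbar})$.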
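- This is because
- $\frac{\partial}{\partial\kappa} (-\mathrm{logLH}) = -N \left\{\frac{\partial}{\partial\kappa} \log C_D(\kappa)\right\} - \mu \cdot R$
- which is zero if
- $-\frac{\partial}{\partial\kappa} \log C_D(\kappa) = \mu \cdot R / N$,
- (the right-hand side equals Rbar when $\mu = R / \|R\|$, because then $\mu \cdot R = \|R\|$),
- where
- $\frac{\partial}{\partial\kappa} \log C_D(\kappa)$
- $= \omega / \kappa - I'_{\omega}(\kappa) / I_{\omega}(\kappa)$, where $\omega = D/2 - 1$
- $= \omega \left\{I_{\omega}(\kappa) - \frac{\kappa}{\omega} I'_{\omega}(\kappa)\right\} / (\kappa\, I_{\omega}(\kappa))$
- $= \omega \left\{\frac{\kappa}{2\omega} \{I_{\omega-1}(\kappa) - I_{\omega+1}(\kappa)\} - \frac{\kappa}{2\omega} \{I_{\omega-1}(\kappa) + I_{\omega+1}(\kappa)\}\right\} / (\kappa\, I_{\omega}(\kappa))$
- $= -I_{D/2}(\kappa) / I_{D/2-1}(\kappa)$,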
- using the "well known" relations,
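- $I_{\nu}(z) = \frac{z}{2\nu} \{I_{\nu-1}(z) - I_{\nu+1}(z)\}$,
- and
- $I'_{\nu}(z) = \frac{1}{2} \{I_{\nu-1}(z) + I_{\nu+1}(z)\}$, ($I'_0(z) = I_1(z)$, which covers the case $\omega = 0$, that is D = 2).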
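- The ratio $I_{D/2}(\kappa) / I_{D/2-1}(\kappa)$ obtained above has no closed-form inverse in general, so the maximum likelihood estimate of κ has to be found numerically. The sketch below uses bisection, relying on the ratio increasing monotonically from 0 towards 1. The function names (`bessel_i`, `a_d`, `kappa_ml`), the bracket [0.001, 200] and the series-based Bessel function are all assumptions of this example, chosen for moderate D and κ; they are not taken from the sources.

```python
import math

def bessel_i(nu, z, tol=1e-16, max_terms=1000):
    """Modified Bessel function of the first kind, I_nu(z), summed from its
    power series; a sketch for moderate z >= 0 and nu >= 0."""
    half = 0.5 * z
    term = half ** nu / math.gamma(nu + 1.0)    # k = 0 term of the series
    total = term
    for k in range(1, max_terms):
        term *= half * half / (k * (k + nu))    # ratio of successive terms
        total += term
        if term <= tol * total:
            break
    return total

def a_d(dim, kappa):
    """A_D(kappa) = I_{D/2}(kappa) / I_{D/2-1}(kappa)."""
    return bessel_i(dim / 2.0, kappa) / bessel_i(dim / 2.0 - 1.0, kappa)

def kappa_ml(dim, rbar, lo=1e-3, hi=200.0, iters=100):
    """Solve A_D(kappa) = Rbar for kappa by bisection on [lo, hi]."""
    if not a_d(dim, lo) < rbar < a_d(dim, hi):
        raise ValueError("Rbar is outside the range covered by the bracket")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if a_d(dim, mid) < rbar:
            lo = mid            # A_D is increasing, so the root lies above mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Round trip for D = 3: recover kappa = 2 from Rbar = A_3(2).
print(kappa_ml(3, a_d(3, 2.0)))
```

For D = 3 the ratio reduces to coth κ - 1/κ, which gives an independent check on `a_d`.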
- The MML estimate, κMML,
- The Fisher information of the vMF distribution.
- The expected second derivative of - logLH w.r.t. κ is
- $\frac{\partial^2}{\partial\kappa^2} (-\mathrm{logLH}) = N A'_D(\kappa)$.
- The vMF distribution is symmetric about μ on the (D-1)-sphere; there is no preferred orientation around μ. A direction, such as μ, has D - 1 degrees of freedom. The expected 2nd derivative of - logLH w.r.t. any one of μ's degrees of freedom is
- $N \kappa A_D(\kappa)$.
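- This is for the following reason:
- Without loss of generality, let
μ = (1, 0, ...) with R lying along it, and then
μ → (cos δ, sin δ, 0, ...), say,
where δ is small, so that $\mu \cdot R = \|R\| \cos\delta$,
- $\frac{\partial}{\partial\delta} (-\mathrm{logLH}) = \kappa \|R\| \sin\delta$,
- $\frac{\partial^2}{\partial\delta^2} (-\mathrm{logLH}) = \kappa \|R\| \cos\delta \approx \kappa \|R\| = N \kappa\, \mathrm{Rbar}$, as δ is small,
- which is
- $N \kappa A_D(\kappa)$ in expectation.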
- Symmetry implies that the off-diagonal elements for μ are zero.
And, μ is a position parameter and κ a scale parameter,
so the off-diagonal elements between μ and κ are also zero.
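- The diagonal entries described above can be evaluated numerically. The sketch below is illustrative only: the function names (`bessel_i`, `a_d`, `a_d_prime`, `fisher_diagonal`, `log_fisher_det`) are hypothetical, the Bessel function is summed from its power series (suitable for moderate κ), and the derivative uses the standard identity $A'_D(\kappa) = 1 - A_D(\kappa)^2 - \frac{D-1}{\kappa} A_D(\kappa)$, which is not stated in the text above. Because the off-diagonal elements are zero, the log-determinant is the sum of the logs of the diagonal entries.

```python
import math

def bessel_i(nu, z, tol=1e-16, max_terms=1000):
    """Modified Bessel function of the first kind, I_nu(z), summed from its
    power series; a sketch for moderate z >= 0 and nu >= 0."""
    half = 0.5 * z
    term = half ** nu / math.gamma(nu + 1.0)    # k = 0 term of the series
    total = term
    for k in range(1, max_terms):
        term *= half * half / (k * (k + nu))    # ratio of successive terms
        total += term
        if term <= tol * total:
            break
    return total

def a_d(dim, kappa):
    """A_D(kappa) = I_{D/2}(kappa) / I_{D/2-1}(kappa)."""
    return bessel_i(dim / 2.0, kappa) / bessel_i(dim / 2.0 - 1.0, kappa)

def a_d_prime(dim, kappa):
    """A'_D(kappa) from the identity A' = 1 - A^2 - ((D-1)/kappa) A
    (assumed here; it is not given in the text above)."""
    a = a_d(dim, kappa)
    return 1.0 - a * a - (dim - 1.0) / kappa * a

def fisher_diagonal(dim, kappa, n):
    """Diagonal of the expected Fisher matrix: D-1 entries N kappa A_D(kappa)
    for the direction mu, then one entry N A'_D(kappa) for kappa."""
    a = a_d(dim, kappa)
    return [n * kappa * a] * (dim - 1) + [n * a_d_prime(dim, kappa)]

def log_fisher_det(dim, kappa, n):
    """Log-determinant of the diagonal Fisher matrix."""
    return sum(math.log(x) for x in fisher_diagonal(dim, kappa, n))

print(fisher_diagonal(3, 2.0, 10), log_fisher_det(3, 2.0, 10))
```

Working with the log-determinant rather than the determinant itself avoids overflow when N or D is large.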
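- F, the Fisher information of the vMF, is therefore the determinant of a diagonal matrix with D - 1 entries $N \kappa A_D(\kappa)$ and one entry $N A'_D(\kappa)$,
- $F = N^D \{\kappa A_D(\kappa)\}^{D-1} A'_D(\kappa)$.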
- Sources
- Search for [vonMises direction] in the [Bib], and
- see section 6.5, p.266 of Wallace's book (2005).
- P. Kasarapu & L. Allison, Minimum message length estimation of mixtures of multivariate Gaussian and von Mises-Fisher distributions, Machine Learning (Springer Verlag), March 2015.
- The special case of the probability distribution where D = 2 is known as the von Mises distribution for directions in $\mathbb{R}^2$, that is, for angles and periodic quantities such as annual events.