von Mises-Fisher (vMF)
- The von Mises-Fisher (vMF) distribution is a probability distribution on directions in R^D. It is natural to think of it as a distribution on the (D-1)-sphere of unit radius, that is, on the surface of the D-ball of unit radius.
- The von Mises-Fisher probability density function is
- pdf(v | μ, κ) = C_D(κ) e^{κ μ·v}
  - where datum v is a normalised D-vector,
    equivalently a point on the (D-1)-sphere,
  - mu, μ, is the mean direction (a normalised D-vector), and
  - kappa, κ ≥ 0, is the concentration parameter (a scalar).
 
- The distribution's normalising constant is
- C_D(κ) = κ^{D/2-1} / {(2π)^{D/2} I_{D/2-1}(κ)}
  - where I_order(.) is the "modified Bessel function of the first kind".
- In the special case that D = 3,
- C_3(κ) = κ / {2π (e^κ - e^{-κ})}.
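As a quick numerical check, the general Bessel-function form of C_D and the D = 3 special case above can be compared directly, because I_{1/2}(κ) = √(2/(πκ)) sinh κ has an elementary closed form. This is only a sketch; the function names are illustrative, not from any particular library.

```python
# Sketch: the D = 3 normalising constant and density of the vMF
# distribution, using the closed form of I_{1/2} so that only the
# standard math module is needed.
import math

def bessel_i_half(k):
    # I_{1/2}(k) = sqrt(2/(pi k)) sinh(k)
    return math.sqrt(2.0 / (math.pi * k)) * math.sinh(k)

def c3_general(k):
    # C_D(k) = k^{D/2-1} / {(2 pi)^{D/2} I_{D/2-1}(k)}, with D = 3
    return k ** 0.5 / ((2 * math.pi) ** 1.5 * bessel_i_half(k))

def c3_special(k):
    # C_3(k) = k / {2 pi (e^k - e^-k)}
    return k / (2 * math.pi * (math.exp(k) - math.exp(-k)))

def vmf_pdf3(v, mu, k):
    # pdf(v | mu, k) = C_3(k) e^{k mu.v}, v and mu unit 3-vectors
    dot = sum(vi * mi for vi, mi in zip(v, mu))
    return c3_special(k) * math.exp(k * dot)

print(c3_general(2.0), c3_special(2.0))  # the two forms agree
print(vmf_pdf3((0.0, 0.0, 1.0), (0.0, 0.0, 1.0), 2.0))
```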
 
- The negative log pdf is
- - log pdf(v | μ, κ) = - log C_D(κ) - κ μ·v,
- and
- log C_D(κ) = (D/2 - 1) log κ - (D/2) log 2π - log I_{D/2-1}(κ).
- Given data {v_0, ..., v_{N-1}}, define their sum (a D-vector),
- R = ∑_{i=0..N-1} v_i,
- and
- Rbar = ||R|| / N.
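The sufficient statistics R and Rbar are straightforward to compute; a small sketch, with made-up illustrative data:

```python
# Sketch: the statistics R = sum_i v_i and Rbar = ||R|| / N for a
# small, made-up sample of unit 3-vectors (illustrative data only).
import math

def normalise(v):
    n = math.sqrt(sum(x * x for x in v))
    return tuple(x / n for x in v)

data = [normalise(v) for v in
        [(1.0, 0.1, 0.0), (0.9, -0.2, 0.1), (1.0, 0.0, 0.2)]]
N = len(data)
D = 3

R = tuple(sum(v[d] for v in data) for d in range(D))  # R = sum of the data
Rbar = math.sqrt(sum(x * x for x in R)) / N           # Rbar = ||R|| / N

print(R, Rbar)
```

Note that 0 ≤ Rbar ≤ 1, with Rbar near 1 when the data are tightly concentrated.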
- The negative log likelihood is
- - logLH = - N log C_D(κ) - κ μ·R.
- It is obvious that the maximum likelihood estimate of μ is R normalised,
- μML = R / ||R||,
- and that the MML estimate is the same,
- μMML = μML = R / ||R||,
- the most general prior for μ being the uniform distribution.
- For given μ and κ, the expected value of Rbar equals
- A_D(κ) = I_{D/2}(κ) / I_{D/2-1}(κ),
- and the (less obvious) maximum likelihood estimate of κ is
- κML = A_D^{-1}(Rbar).
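For D = 3, A_3(κ) = I_{3/2}(κ)/I_{1/2}(κ) simplifies to coth κ - 1/κ (the Langevin function), and A_3 is increasing, so A_3^{-1} can be evaluated by simple bisection. A sketch, with illustrative function names:

```python
# Sketch: maximum likelihood estimates of mu and kappa in D = 3, where
# A_3(k) = coth(k) - 1/k and A_3^{-1} is computed by bisection.
import math

def a3(k):
    # A_3(k) = I_{3/2}(k) / I_{1/2}(k) = coth(k) - 1/k
    return 1.0 / math.tanh(k) - 1.0 / k

def a3_inverse(rbar, lo=1e-9, hi=1e6):
    # A_3 increases from 0 towards 1, so bisection suffices
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if a3(mid) < rbar:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def ml_estimates(data):
    # data: a list of unit 3-vectors
    N = len(data)
    R = tuple(sum(v[d] for v in data) for d in range(3))
    normR = math.sqrt(sum(x * x for x in R))
    mu_ml = tuple(x / normR for x in R)  # mu_ML = R / ||R||
    kappa_ml = a3_inverse(normR / N)     # kappa_ML = A_3^{-1}(Rbar)
    return mu_ml, kappa_ml
```

The round trip A_3(A_3^{-1}(Rbar)) = Rbar can be used as a sanity check on the inversion.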
- This is because
- ∂/∂κ (- logLH) = - N {∂/∂κ log C_D(κ)} - μ·R
- which is zero if
- - ∂/∂κ log C_D(κ) = μ·R / N,
- where
- ∂/∂κ log C_D(κ)
- = ω/κ - I'_ω(κ) / I_ω(κ), where ω = D/2 - 1,
- = ω {I_ω(κ) - (κ/ω) I'_ω(κ)} / (κ I_ω(κ))
- = ω {(κ/2ω)(I_{ω-1}(κ) - I_{ω+1}(κ)) - (κ/2ω)(I_{ω-1}(κ) + I_{ω+1}(κ))} / (κ I_ω(κ))
- = - I_{D/2}(κ) / I_{D/2-1}(κ) = - A_D(κ),
- using the "well known" relations,
- I_ν(z) = (z/2ν) {I_{ν-1}(z) - I_{ν+1}(z)},
- and
- I'_ν(z) = (1/2) {I_{ν-1}(z) + I_{ν+1}(z)},  (I'_0(z) = I_1(z)).
- At μ = μML, μ·R / N = Rbar, so the zero is at A_D(κML) = Rbar.
- The MML estimate, κMML, is the value that minimises the two-part message length; no closed form is known for κMML. The message length calculations also require a choice of prior for κ, and the vMF's Fisher information, F.
- The Fisher information of the vMF distribution.
- The expected second derivative of - logLH w.r.t. κ is
- ∂²/∂κ² (- logLH) = N A'_D(κ).
- The vMF distribution is symmetric about μ on the (D-1)-sphere; there is no preferred orientation around μ. A direction, such as μ, has D - 1 degrees of freedom. The expected second derivative of - logLH w.r.t. any one of μ's degrees of freedom is
- N κ A_D(κ).
  - This is for the following reason:
    Without loss of generality, let
    μ = (1, 0, ...) and R = (||R||, 0, ...), and then
    μ → (cos δ, sin δ, 0, ...), say,
    where δ is small, so that μ·R = ||R|| cos δ;
  - ∂/∂δ (- logLH) = κ ||R|| sin δ,
  - ∂²/∂δ² (- logLH) = κ ||R|| cos δ ≈ κ ||R||, as δ is small,
  - which is
  - N κ A_D(κ) in expectation, the expected value of ||R|| being N A_D(κ).
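The curvature argument can be sketched with a finite difference: taking μ and R along the first axis, the δ-dependent part of - logLH is - κ ||R|| cos δ, whose second derivative at δ = 0 is κ ||R|| (and ||R|| ≈ N A_D(κ) in expectation). The values below are illustrative only.

```python
# Sketch: finite-difference check that the curvature of -logLH in the
# rotation angle delta is kappa * ||R|| at delta = 0.
import math

kappa, normR = 2.0, 2.7  # illustrative values of kappa and ||R||

def neg_log_lh(delta):
    # only the delta-dependent term: -kappa mu.R = -kappa ||R|| cos(delta)
    return -kappa * normR * math.cos(delta)

h = 1e-4  # second central difference
curvature = (neg_log_lh(h) - 2 * neg_log_lh(0.0) + neg_log_lh(-h)) / (h * h)

print(curvature, kappa * normR)  # ~ equal
```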
 
- Symmetry implies that the off-diagonal elements for μ are zero.
  And, μ is a position parameter and κ a scale parameter,
  so the off-diagonal elements between μ and κ are also zero.
- F, the Fisher information of the vMF, is therefore
- F = N^D (κ A_D(κ))^{D-1} A'_D(κ).
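For D = 3 the Fisher information is easy to evaluate, since A_3(κ) = coth κ - 1/κ differentiates to A'_3(κ) = 1/κ² - 1/sinh²κ. A sketch, with illustrative values of N and κ:

```python
# Sketch: F = N^D (kappa A_D(kappa))^{D-1} A'_D(kappa), evaluated for
# D = 3 using A_3(k) = coth(k) - 1/k and A'_3(k) = 1/k^2 - 1/sinh(k)^2.
import math

def a3(k):
    return 1.0 / math.tanh(k) - 1.0 / k

def a3_prime(k):
    return 1.0 / (k * k) - 1.0 / (math.sinh(k) ** 2)

def fisher_vmf3(N, k):
    D = 3
    return N ** D * (k * a3(k)) ** (D - 1) * a3_prime(k)

print(fisher_vmf3(100, 2.0))  # N = 100, kappa = 2, illustrative
```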
- Sources
- Search for [vonMises direction] in the
  [Bib], and
- see section 6.5, p.266 of Wallace's book (2005).
- P. Kasarapu & L. Allison, Minimum message length estimation of mixtures of multivariate Gaussian and von Mises-Fisher distributions, Machine Learning (Springer Verlag), March 2015 [click].
- The special case of the probability distribution where D = 2 is known as the von Mises distribution for directions in R^2, that is for angles and periodic quantities such as annual events.