A Novel Algorithm for Clustering of Data on the Unit Sphere via Mixture Models

A new maximum approximate likelihood (ML) estimation algorithm for mixtures of Kent distributions is proposed. The new algorithm is constructed via the BSLM (block successive lower-bound maximization) framework and incorporates manifold optimization procedures within it. The BSLM algorithm is iterative and monotonically increases the approximate log-likelihood function at each step. Under mild regularity conditions, the BSLM algorithm is proved to be convergent and the approximate ML estimator is proved to be consistent. A Bayesian information criterion-like (BIC-like) model selection criterion is also derived for the task of choosing the number of components in the mixture distribution. The approximate ML estimator and the BIC-like criterion are both demonstrated to be successful via simulation studies. A model-based clustering rule is proposed and also assessed favorably via simulations. Example applications of the developed methodology are provided via an image segmentation task and a neural imaging clustering problem.
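To make the model-based clustering rule concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the Kent density on the unit sphere in three dimensions, with the normalizing constant replaced by the large-concentration approximation c(κ, β) ≈ 2π e^κ [(κ − 2β)(κ + 2β)]^(−1/2), which requires 0 ≤ 2β < κ. Each point is assigned to the component with the largest weighted log-density. The function names `kent_log_density_approx` and `assign_clusters`, and the example parameter values, are hypothetical and chosen only for illustration.

```python
import numpy as np


def kent_log_density_approx(x, g1, g2, g3, kappa, beta):
    """Approximate Kent log-density at unit vectors x of shape (n, 3).

    g1 is the mean direction, g2 and g3 the major and minor axes
    (assumed orthonormal). The log normalizing constant uses the
    large-kappa approximation, which assumes 0 <= 2 * beta < kappa.
    """
    log_c = (np.log(2.0 * np.pi) + kappa
             - 0.5 * np.log((kappa - 2.0 * beta) * (kappa + 2.0 * beta)))
    return kappa * (x @ g1) + beta * ((x @ g2) ** 2 - (x @ g3) ** 2) - log_c


def assign_clusters(x, weights, components):
    """Assign each row of x to the component with the largest
    log(weight) + approximate log-density."""
    scores = np.column_stack([
        np.log(w) + kent_log_density_approx(x, *c)
        for w, c in zip(weights, components)
    ])
    return scores.argmax(axis=1)


# Illustrative two-component mixture (hypothetical parameter values).
e1, e2, e3 = np.eye(3)
components = [
    (e3, e1, e2, 20.0, 2.0),  # concentrated around the north pole
    (e1, e2, e3, 20.0, 2.0),  # concentrated around the x-axis
]
weights = [0.5, 0.5]
points = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]])
labels = assign_clusters(points, weights, components)
print(labels)  # [0 1]
```

In practice the weights and component parameters would come from a fitted mixture; here they are fixed by hand so that the assignment rule can be checked in isolation.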
