Extending mixtures of factor models using the restricted multivariate skew-normal distribution

The mixture of factor analyzers (MFA) model is a powerful tool for analyzing high-dimensional data because its factor-analytic representation of the component covariance matrices reduces the number of free parameters. This paper extends the MFA model by adopting a restricted version of the multivariate skew-normal distribution for the latent component factors; the resulting model is called mixtures of skew-normal factor analyzers (MSNFA). The MSNFA model relaxes the normality assumption on the latent factors in order to accommodate skewness in the observed data, and thus provides an approach to model-based density estimation and clustering of high-dimensional data exhibiting asymmetric characteristics. A computationally feasible expectation conditional maximization (ECM) algorithm is developed for computing the maximum likelihood estimates of the model parameters. The potential of the proposed methodology is illustrated using both real and simulated data.
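
The parameter saving mentioned above can be illustrated with a small count. The sketch below assumes the usual MFA covariance structure, in which each p-dimensional component covariance is written as B B^T + D with a p x q loading matrix B and a diagonal D, and it subtracts q(q-1)/2 for the rotational indeterminacy of B. That structure and the function names are assumptions for illustration, not taken from the abstract, and the extra skewness parameters of the MSNFA model are not counted.

```python
def full_cov_params(p: int) -> int:
    """Free parameters in one unrestricted p x p covariance matrix."""
    return p * (p + 1) // 2


def factor_cov_params(p: int, q: int) -> int:
    """Free parameters in one covariance of the assumed form B B^T + D.

    B is p x q (loadings), D is diagonal (p uniquenesses); q(q-1)/2 is
    removed because B is only identified up to an orthogonal rotation.
    """
    return p * q + p - q * (q - 1) // 2


if __name__ == "__main__":
    p, q = 50, 2  # hypothetical dimensions chosen for illustration
    print("unrestricted:", full_cov_params(p))
    print("factor-analytic:", factor_cov_params(p, q))
```

With these hypothetical dimensions the factor-analytic form needs roughly an order of magnitude fewer covariance parameters per component, and the gap widens as p grows with q held small.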
