Intrinsic dimension estimation by maximum likelihood in isotropic probabilistic PCA

A central issue in dimension reduction is choosing a sensible number of dimensions to retain. This work establishes a surprising result: the maximum likelihood criterion is asymptotically consistent for determining the intrinsic dimension of a dataset in an isotropic version of probabilistic principal component analysis (PPCA). Numerical experiments on simulated and real datasets show that the maximum likelihood criterion can be used in practice and outperforms existing intrinsic dimension selection criteria in various situations. The paper also identifies the limits of the maximum likelihood criterion, which lead to recommending the AIC criterion in specific situations. A useful application of this work would be the automatic selection of intrinsic dimensions in mixtures of isotropic PPCA for classification.
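
To make the criterion concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes the isotropic PPCA covariance has d eigenvalues equal to a common value a and the remaining p - d equal to a common value b, so that the maximized Gaussian log-likelihood depends only on the eigenvalues of the sample covariance. The function names and the free-parameter count used for AIC (p for the mean, 2 for a and b, and d(p - (d + 1)/2) for the orientation) are assumptions introduced here for illustration.

```python
import numpy as np


def isotropic_ppca_loglik(eigvals, d, n):
    """Maximized Gaussian log-likelihood for candidate dimension d.

    Assumes a covariance with d eigenvalues equal to a and p - d equal to b;
    their ML estimates are the means of the corresponding sample eigenvalues.
    `eigvals` must be sorted in decreasing order.
    """
    p = eigvals.size
    a = eigvals[:d].mean()  # common variance in the d-dimensional subspace
    b = eigvals[d:].mean()  # common variance in the complementary subspace
    log_det = d * np.log(a) + (p - d) * np.log(b)
    # At the ML estimate, trace(Sigma^{-1} S) equals p.
    return -0.5 * n * (log_det + p * np.log(2.0 * np.pi) + p)


def select_dimension(X):
    """Return (d_ml, d_aic) chosen over candidate dimensions 1, ..., p - 1."""
    n, p = X.shape
    S = np.cov(X, rowvar=False, bias=True)  # ML (biased) sample covariance
    eigvals = np.sort(np.linalg.eigvalsh(S))[::-1]
    dims = np.arange(1, p)
    loglik = np.array([isotropic_ppca_loglik(eigvals, d, n) for d in dims])
    # Assumed number of free parameters for each candidate dimension.
    n_params = p + 2 + dims * (p - (dims + 1) / 2.0)
    aic = -2.0 * loglik + 2.0 * n_params
    return int(dims[np.argmax(loglik)]), int(dims[np.argmin(aic)])
```

Calling `select_dimension(X)` on an n-by-p data matrix returns the dimension that maximizes the likelihood and the one that minimizes AIC, so the two criteria can be compared on the same data.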
