Paper Information - EM Algorithms for PCA and SPCA

I present an expectation-maximization (EM) algorithm for principal component analysis (PCA). The algorithm allows a few eigenvectors and eigenvalues to be extracted from large collections of high dimensional data. It is computationally very efficient in space and time. It also naturally accommodates missing information. I also introduce a new variant of PCA called sensible principal component analysis (SPCA) which defines a proper density model in the data space. Learning for SPCA is also done with an EM algorithm. I report results on synthetic and real data showing that these EM algorithms correctly and efficiently find the leading eigenvectors of the covariance of datasets in a few iterations using up to hundreds of thousands of datapoints in thousands of dimensions.
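The abstract does not spell out the update equations, so the following is only a minimal sketch of the kind of EM iteration it describes, restricted to a single component and to complete data (no missing values). It assumes the zero-noise form of the updates: the E-step projects each centered datapoint onto the current direction estimate, and the M-step re-estimates the direction from those projections. The function name `em_pca_first_component` and its parameters are hypothetical, chosen for illustration.

```python
import math
import random


def em_pca_first_component(data, n_iter=50, seed=0):
    """Sketch: estimate the leading principal direction of `data`.

    `data` is a list of n rows, each a list of p floats.
    Assumes complete data and a single component (k = 1).
    """
    n = len(data)
    p = len(data[0])

    # Center the data so the direction refers to the covariance.
    mean = [sum(row[d] for row in data) / n for d in range(p)]
    centered = [[row[d] - mean[d] for d in range(p)] for row in data]

    # Random initial direction (illustrative choice of initialization).
    rng = random.Random(seed)
    c = [rng.gauss(0.0, 1.0) for _ in range(p)]

    for _ in range(n_iter):
        # E-step: latent coordinate of each point, x_j = (c . y_j) / (c . c).
        cc = sum(v * v for v in c)
        x = [sum(ci * yi for ci, yi in zip(c, row)) / cc for row in centered]

        # M-step: new direction, c = (sum_j x_j y_j) / (sum_j x_j^2).
        xx = sum(v * v for v in x)
        c = [sum(x[j] * centered[j][d] for j in range(n)) / xx
             for d in range(p)]

    # Return a unit vector; its sign is arbitrary.
    norm = math.sqrt(sum(v * v for v in c))
    return [v / norm for v in c]
```

Each iteration touches the data only through sums over datapoints and never forms the p-by-p covariance matrix, which is consistent with the space and time efficiency the abstract claims. For k components, the two steps would become small matrix solves instead of scalar divisions.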

Sam T. Roweis
