Principal Component Analysis and Optimization: A Tutorial

Principal component analysis (PCA) is one of the most widely used multivariate techniques in statistics. It is commonly used to reduce the dimensionality of data in order to examine its underlying structure and the covariance/correlation structure of a set of variables. While singular value decomposition provides a simple means for identification of the principal components (PCs) for classical PCA, solutions achieved in this manner may not possess certain desirable properties including robustness, smoothness, and sparsity. In this paper, we present several optimization problems related to PCA by considering various geometric perspectives. New techniques for PCA can be developed by altering the optimization problems to which principal component loadings are the optimal solutions.
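As an illustration of the classical case mentioned above, the following is a minimal sketch of PCA computed through the singular value decomposition of the centered data matrix. It is not taken from the paper: the function name `pca_svd`, the synthetic data, and the use of NumPy are assumptions made for this example only.

```python
import numpy as np

def pca_svd(X, k):
    """Hypothetical helper: top-k PCA of an n x p data matrix via SVD."""
    Xc = X - X.mean(axis=0)                      # center each variable (column)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[:k].T                          # p x k, orthonormal loading vectors
    scores = Xc @ loadings                       # n x k, data projected onto the PCs
    variances = s[:k] ** 2 / (X.shape[0] - 1)    # sample variance along each PC
    return loadings, scores, variances

# Synthetic data (an assumption for illustration): three variables
# with deliberately different scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) @ np.diag([3.0, 1.0, 0.2])

loadings, scores, variances = pca_svd(X, 2)
```

The columns of `loadings` are the right singular vectors of the centered data, and `variances` coincides with the leading eigenvalues of the sample covariance matrix. Robust, smooth, or sparse variants of the kind the abstract refers to would replace this closed-form step with a different optimization problem.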
