Orthogonal component analysis: A fast dimensionality reduction algorithm

Most existing dimensionality reduction algorithms have two disadvantages: their computational cost is high, and they cannot estimate the intrinsic dimension of the original dataset by themselves. To address these problems, we propose a fast linear dimensionality reduction method named Orthogonal Component Analysis (OCA). By avoiding both eigenproblems and matrix inversion, OCA achieves high-speed orthogonal component extraction. Through an adaptive threshold scheme, OCA estimates the dimension of the feature space automatically. The algorithm is also guaranteed to be numerically stable. In the experiments, OCA is compared with several typical dimensionality reduction algorithms. The experimental results demonstrate that, as a universal algorithm, OCA is efficient and effective.
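The abstract does not spell out the OCA procedure, so the following is only an illustrative sketch of the general idea it names: extracting orthogonal components without an eigendecomposition or a matrix inverse, and letting a threshold decide how many components to keep. It uses plain modified Gram-Schmidt with a fixed relative tolerance, which is an assumption made here and not the paper's adaptive threshold scheme. The names `extract_basis`, `project`, and `rel_tol` are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def extract_basis(samples, rel_tol=1e-8):
    """Build an orthonormal basis from the samples by modified Gram-Schmidt.

    A sample contributes a new component only if its residual, after
    removing its projections onto the components found so far, is larger
    than rel_tol times the sample's own norm. The fixed rel_tol is an
    assumption for illustration, not the paper's adaptive threshold.
    """
    basis = []
    for x in samples:
        norm_x = math.sqrt(dot(x, x))
        if norm_x == 0.0:
            continue  # a zero vector carries no direction
        r = list(x)
        # Modified Gram-Schmidt: subtract one projection at a time,
        # always using the updated residual.
        for q in basis:
            c = dot(r, q)
            r = [ri - c * qi for ri, qi in zip(r, q)]
        norm_r = math.sqrt(dot(r, r))
        if norm_r > rel_tol * norm_x:
            basis.append([ri / norm_r for ri in r])
    return basis


def project(samples, basis):
    """Coordinates of each sample in the extracted orthonormal basis."""
    return [[dot(x, q) for q in basis] for x in samples]


# Toy data in 3 dimensions; the last two samples are linear
# combinations of the first two.
samples = [
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
    [1.0, 1.0, 2.0],
    [2.0, -1.0, 1.0],
]
basis = extract_basis(samples)
reduced = project(samples, basis)
print("estimated dimension:", len(basis))
```

In this sketch the number of basis vectors that survive the threshold plays the role of the estimated dimension, and `reduced` holds the lower-dimensional coordinates. Only inner products and vector updates are needed, which is the sense in which this kind of scheme sidesteps eigenproblems and matrix inversion.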
