To obtain orthogonal feature extraction using training data selection

Feature extraction is an effective tool in data mining and machine learning. Many feature extraction methods have been investigated recently, but few of them produce orthogonal components. Non-orthogonal components distort the metric structure of the original data space and contain redundant information. In this paper, we propose a feature extraction method, named incremental orthogonal basis analysis (IOBA), to address these problems. First, IOBA learns components for the original data that are orthogonal not only theoretically but also numerically. Second, an innovative training data selection scheme is proposed, which helps IOBA pick numerically orthogonal components from the training patterns. Third, thanks to a self-adaptive threshold technique, IOBA requires no prior knowledge of the number of components. Moreover, because it does not solve eigenvalue and eigenvector problems, IOBA not only avoids heavy computation but also avoids ill-conditioned problems. Experimental results show the efficiency of the proposed IOBA.
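The abstract does not give the IOBA algorithm itself, so the following is only a minimal sketch of the general idea it describes: building an orthonormal basis incrementally from selected training patterns, without any eigendecomposition. It uses a Gram-Schmidt-style residual test, which is a swapped-in technique and not necessarily the authors' procedure. The function names and the fixed relative tolerance `rel_tol` are hypothetical; in particular, `rel_tol` is a simple stand-in for the self-adaptive threshold, whose definition is not stated here.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def residual(x, basis):
    """Part of x that is orthogonal to every vector in an orthonormal basis.

    The projection removal is run twice (re-orthogonalization), a common way
    to keep the result orthogonal numerically and not just on paper.
    """
    r = list(x)
    for _ in range(2):
        for b in basis:
            c = dot(r, b)
            r = [ri - c * bi for ri, bi in zip(r, b)]
    return r


def select_orthogonal_basis(patterns, rel_tol=1e-8):
    """Greedily pick training patterns that add a new direction.

    A pattern is selected when its residual norm exceeds rel_tol times its
    own norm (hypothetical criterion); its normalized residual becomes a
    new component. The number of components is not fixed in advance.
    """
    basis = []
    for x in patterns:
        norm_x = math.sqrt(dot(x, x))
        if norm_x == 0.0:
            continue  # a zero pattern carries no direction
        r = residual(x, basis)
        norm_r = math.sqrt(dot(r, r))
        if norm_r > rel_tol * norm_x:
            basis.append([ri / norm_r for ri in r])
    return basis


def extract_features(x, basis):
    """Coordinates of x in the learned orthonormal basis."""
    return [dot(x, b) for b in basis]


patterns = [[1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [1.0, 1.0, 0.0], [3.0, 3.0, 0.0]]
basis = select_orthogonal_basis(patterns)
features = extract_features([2.0, 5.0, 7.0], basis)
```

In this toy run the second and fourth patterns lie in the span of the earlier ones, so only two components are kept. Because the basis is orthonormal, distances between extracted feature vectors equal distances between the projections of the original patterns onto the learned subspace, which is the metric-preservation property the abstract contrasts with non-orthogonal components.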
