Unsupervised Feature Selection Using Incremental Least Squares

An unsupervised feature selection method is proposed for the analysis of high-dimensional datasets. The least squares error (LSE) of approximating the complete dataset from a reduced feature subset is proposed as the quality measure for feature selection. Guided by minimization of the LSE, a kernel least squares forward selection algorithm (KLS-FS) is developed that is capable of both linear and nonlinear feature selection. An incremental LSE computation is designed to accelerate the selection process and thereby enhance the scalability of KLS-FS to high-dimensional datasets. The superiority of the proposed algorithm, in terms of preserving principal data structures, learning performance in classification and clustering applications, and robustness, is demonstrated on various real-life datasets of different sizes and dimensions.
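The linear (non-kernel) core of LSE-guided forward selection can be sketched as follows: at each step, the candidate feature that most reduces the least squares error of reconstructing the full data matrix from the selected columns is added. This is an illustrative sketch only; function names are hypothetical, and it omits the kernel extension and the incremental LSE update described in the paper.

```python
import numpy as np

def lse(X, subset):
    # Least squares error of approximating the full data matrix X
    # from the selected columns via ordinary least squares.
    Xs = X[:, subset]
    W, *_ = np.linalg.lstsq(Xs, X, rcond=None)
    return np.linalg.norm(X - Xs @ W) ** 2

def ls_forward_select(X, k):
    # Greedy forward selection: repeatedly add the feature that
    # yields the smallest reconstruction error when included.
    d = X.shape[1]
    selected = []
    for _ in range(k):
        remaining = [j for j in range(d) if j not in selected]
        best = min(remaining, key=lambda j: lse(X, selected + [j]))
        selected.append(best)
    return selected

# Toy example: feature 2 is a near-duplicate of feature 0, so a
# two-feature subset should capture most of the data's structure.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))
X[:, 2] = X[:, 0] + 0.01 * rng.standard_normal(100)
print(ls_forward_select(X, 2))
```

Recomputing the LSE from scratch for every candidate, as above, costs a full least squares solve per evaluation; the paper's incremental LSE computation is what makes this greedy loop scale to high-dimensional data.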
