Dimension Reduction and Feature Selection

Data Mining algorithms search for meaningful patterns in raw data sets. The Data Mining process incurs a high computational cost when dealing with large data sets, and reducing dimensionality (the number of attributes or the number of records) can effectively cut this cost. This chapter focuses on pre-processing steps that reduce the dimensionality of a given data set before it is fed to a data mining algorithm. It explains how dimensionality can often be reduced with minimal loss of information. A clear taxonomy of dimension reduction is described, and techniques for dimension reduction are presented theoretically.
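As a concrete illustration, the sketch below contrasts the two broad strategies such a taxonomy typically distinguishes: feature extraction (here PCA, which builds new attributes) and feature selection (here a simple filter criterion, which keeps a subset of the original attributes). It is a minimal sketch using scikit-learn on a synthetic data set; the 95% variance threshold and k=10 are illustrative assumptions, not values taken from the chapter.

```python
# Minimal sketch: two routes to a lower-dimensional data set before mining.
# Assumes scikit-learn; data and parameter values are illustrative only.
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif

# Synthetic data: 1000 records, 50 attributes, only 10 of them informative.
X, y = make_classification(n_samples=1000, n_features=50,
                           n_informative=10, random_state=0)

# Feature extraction: project onto enough principal components to retain
# 95% of the variance, trading a little information for fewer columns.
pca = PCA(n_components=0.95)
X_pca = pca.fit_transform(X)
print(f"PCA: {X.shape[1]} attributes -> {X_pca.shape[1]} components")

# Feature selection (filter style): keep the 10 original attributes with
# the highest ANOVA F-score relative to the class label.
selector = SelectKBest(score_func=f_classif, k=10)
X_sel = selector.fit_transform(X, y)
print(f"Filter: {X.shape[1]} attributes -> {X_sel.shape[1]} selected")
```

Both outputs can then be handed to any downstream learner; the extraction route loses attribute interpretability, while the selection route preserves it, which is the usual trade-off between the two branches.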
