Dimensionality Reduction

Dimensionality reduction studies methods that effectively reduce data dimensionality for efficient processing in tasks such as pattern recognition, machine learning, text retrieval, and data mining. We introduce the field by dividing it into two parts: feature extraction and feature selection. Feature extraction creates new features by combining the original features, whereas feature selection produces a subset of the original features. Both attempt to reduce the dimensionality of a dataset in order to facilitate efficient data processing. We introduce key concepts of feature extraction and feature selection, describe some basic methods, and illustrate their applications with practical cases. Extensive research into dimensionality reduction has been carried out over the past several decades, and demand continues to grow today, driven by high-dimensional applications such as gene expression analysis, text categorization, and document indexing.
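The distinction between the two parts can be illustrated with a minimal sketch (not from the source): feature extraction is shown here via PCA, whose new features are linear combinations of all original features, while feature selection is shown via a simple variance filter that keeps a subset of the original columns. The data, the choice of PCA, and the variance criterion are illustrative assumptions.

```python
import numpy as np

# Toy data: 100 samples, 5 features (synthetic, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X[:, 3] = X[:, 0] + 0.01 * rng.normal(size=100)  # a nearly redundant feature

# Feature extraction (PCA): project onto directions that are
# combinations of ALL original features.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
X_extracted = Xc @ Vt[:2].T  # 2 new features, each a mix of the 5 originals

# Feature selection (filter): keep a SUBSET of the original features,
# here the 2 columns with the highest variance.
keep = np.argsort(X.var(axis=0))[-2:]
X_selected = X[:, keep]  # 2 of the original 5 features, unchanged

print(X_extracted.shape, X_selected.shape)  # both reduce 5 -> 2 dimensions
```

Either route yields a 2-dimensional dataset, but only the selected features retain their original interpretation, which matters in domains such as gene expression analysis.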