Feature Selection

Feature selection, as a data preprocessing strategy, has proven effective and efficient in preparing data (especially high-dimensional data) for various data-mining and machine-learning problems. Its objectives include building simpler and more comprehensible models, improving data-mining performance, and preparing clean, understandable data. The recent proliferation of big data has presented substantial challenges and opportunities for feature selection. In this survey, we provide a comprehensive and structured overview of recent advances in feature selection research. Motivated by current challenges and opportunities in the era of big data, we revisit feature selection research from a data perspective and review representative feature selection algorithms for conventional data, structured data, heterogeneous data, and streaming data. Methodologically, to emphasize the differences and similarities among most existing feature selection algorithms for conventional data, we categorize them into four main groups: similarity-based, information-theoretical-based, sparse-learning-based, and statistical-based methods. To facilitate and promote research in this community, we also present an open-source feature selection repository that contains most of the popular feature selection algorithms (http://featureselection.asu.edu/). We also use it as an example to show how to evaluate feature selection algorithms. At the end of the survey, we discuss some open problems and challenges that require more attention in future research.
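To make the filter-style setting concrete, the following is a minimal sketch of ranking features by a Fisher-score-style criterion, which is often grouped with similarity-based methods. It is an illustration under stated assumptions and not code from the repository mentioned above: the function names `fisher_scores` and `select_top_k`, the toy data, and the small constant added to the denominator are all hypothetical choices made for this example.

```python
from collections import defaultdict


def fisher_scores(rows, labels):
    """Return one Fisher-style score per feature (higher = more discriminative).

    Score for feature j: between-class scatter / within-class scatter,
    with each class weighted by its number of samples.
    """
    n_features = len(rows[0])

    # Group the samples by class label.
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)

    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        overall_mean = sum(column) / len(column)
        between = 0.0
        within = 0.0
        for members in groups.values():
            values = [row[j] for row in members]
            mean_c = sum(values) / len(values)
            var_c = sum((v - mean_c) ** 2 for v in values) / len(values)
            between += len(values) * (mean_c - overall_mean) ** 2
            within += len(values) * var_c
        # The small constant is an assumption that avoids division by zero.
        scores.append(between / (within + 1e-12))
    return scores


def select_top_k(rows, labels, k):
    """Return the indices of the k highest-scoring features."""
    scores = fisher_scores(rows, labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Toy data: feature 0 separates the two classes, feature 1 does not.
X = [[0.0, 1.0], [0.1, 5.0], [1.0, 1.2], [1.1, 4.8]]
y = [0, 0, 1, 1]
top_features = select_top_k(X, y, k=1)
```

On this toy data, `top_features` contains only the index of feature 0, because its class means differ while its within-class spread is small. A criterion of this kind scores each feature independently, so it does not account for redundancy among the selected features.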
