Relief-Based Feature Selection: Introduction and Review

Feature selection plays a critical role in biomedical data mining, driven by the increasing feature dimensionality of target problems and by growing interest in advanced but computationally expensive methodologies able to model complex associations. Specifically, there is a need for feature selection methods that are computationally efficient yet sensitive to complex patterns of association (e.g., interactions), so that informative features are not mistakenly eliminated before downstream modeling. This paper focuses on Relief-based algorithms (RBAs), a unique family of filter-style feature selection algorithms that have gained appeal by striking an effective balance between these objectives while flexibly adapting to various data characteristics (e.g., classification vs. regression). First, we broadly examine types of feature selection and define RBAs within that context. Next, we introduce the original Relief algorithm and associated concepts, emphasizing the intuition behind how it works, how the feature weights it generates can be interpreted, and why it is sensitive to feature interactions without evaluating combinations of features. Lastly, we include an expansive review of RBA methodological research beyond Relief and its popular descendant, ReliefF. In particular, we characterize branches of RBA research and provide comparative summaries of RBAs, including contributions, strategies, functionality, time complexity, adaptation to key data characteristics, and software availability.
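
As an illustration of the interaction sensitivity described above, the following is a minimal sketch of a Relief-style weight update, not the paper's reference implementation. It makes several simplifying assumptions that are not stated in the abstract: features are discrete with a 0/1 mismatch diff, every instance is used once as the target (the original algorithm samples a subset of instances at random), and ties for the nearest neighbor are broken by index order. The function name relief_weights and the toy dataset are hypothetical.

```python
from itertools import product


def relief_weights(X, y):
    """Score each feature with a simplified, deterministic Relief pass.

    Assumptions (illustrative only): discrete features, 0/1 mismatch diff,
    all n instances used as targets, nearest-neighbor ties broken by index.
    """
    n, p = len(X), len(X[0])
    weights = [0.0] * p

    def diff(a, i, j):
        # 0 if instances i and j agree on feature a, 1 otherwise.
        return 0.0 if X[i][a] == X[j][a] else 1.0

    def distance(i, j):
        # Manhattan distance built from the per-feature diff.
        return sum(diff(a, i, j) for a in range(p))

    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        nearest_hit = min(hits, key=lambda j: distance(i, j))
        nearest_miss = min(misses, key=lambda j: distance(i, j))
        for a in range(p):
            # Reward a feature that differs from the nearest miss;
            # penalize a feature that differs from the nearest hit.
            weights[a] += (diff(a, i, nearest_miss) - diff(a, i, nearest_hit)) / n
    return weights


# Toy data: the class is the XOR of features 0 and 1; feature 2 is irrelevant.
# Neither feature 0 nor feature 1 is associated with the class on its own.
X = [list(row) for row in product([0, 1], repeat=3)]
y = [row[0] ^ row[1] for row in X]

weights = relief_weights(X, y)
print(weights)  # [0.5, 0.5, -1.0]
```

In this toy case the two interacting features receive positive weights even though no pair of features is ever evaluated jointly: the nearest miss always differs from the target in exactly one of them, while the nearest hit differs only in the irrelevant feature. The extreme value of -1.0 for that feature is an artifact of the tiny, noise-free dataset.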
