Feature Subset Selection with TAR2less

A repeated empirical result is that machine learners can build adequate models from a small subset of the available features. Learning from such subsets can be faster and produces simpler models. In this paper we present a new method for feature subset selection using the TAR2 treatment learner. TAR2 assumes small backbones; i.e., it assumes that a small number of features suffices for selecting preferred classes. TAR2 can therefore be used as a pre-processor to other learners, identifying useful feature subsets before learning begins. When compared to the methods described in a recent survey by Hall and Holmes (in press), TAR2 found the smallest subsets, with minimal or no loss in classification accuracy.
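To make the idea concrete, here is a minimal sketch of treatment-learner-style feature ranking. This is not the TAR2 implementation: real TAR2 searches for *treatments* (conjunctions of feature=value constraints) that shift the class distribution toward preferred classes. The sketch below approximates that with single-feature lift scores; the function names, data layout, and scoring rule are all illustrative assumptions.

```python
# Hypothetical sketch, NOT the actual TAR2 algorithm: rank features by
# how much their best single value lifts the frequency of a preferred
# class above its baseline frequency, then keep the top-k features.

def lift_scores(rows, labels, preferred):
    """Score each feature by the largest improvement any one of its
    values gives over the baseline rate of the preferred class.

    rows:   list of dicts mapping feature name -> discrete value
    labels: class label for each row
    """
    baseline = sum(1 for y in labels if y == preferred) / len(labels)
    scores = {}
    for feat in rows[0]:
        best = 0.0
        for v in {r[feat] for r in rows}:
            idx = [i for i, r in enumerate(rows) if r[feat] == v]
            hits = sum(1 for i in idx if labels[i] == preferred)
            # improvement over the baseline preferred-class rate
            best = max(best, hits / len(idx) - baseline)
        scores[feat] = best
    return scores

def select_features(rows, labels, preferred, k):
    """Return the k features whose best value most lifts the preferred class."""
    scores = lift_scores(rows, labels, preferred)
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

On a toy dataset where feature `a` perfectly predicts the class and `b` is noise, `select_features(rows, labels, "good", 1)` keeps only `a`, mirroring the small-backbone assumption: a handful of features carries most of the signal.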

[1] Igor Kononenko, et al. Estimating Attributes: Analysis and Extensions of RELIEF, 1994, ECML.

[2] Stephen D. Bay, et al. Detecting change in categorical data: mining contrast sets, 1999, KDD '99.

[3] Wynne Hsu, et al. Integrating Classification and Association Rule Mining, 1998, KDD.

[4] Geoff Holmes, et al. Benchmarking Attribute Selection Techniques for Discrete Class Data Mining, 2003, IEEE Trans. Knowl. Data Eng.

[5] Huan Liu, et al. A Probabilistic Approach to Feature Selection - A Filter Solution, 1996, ICML.

[6] Ramakrishnan Srikant, et al. Fast Algorithms for Mining Association Rules in Large Databases, 1994, VLDB.

[7] Andrew J. Parkes, et al. Clustering at the Phase Transition, 1997, AAAI/IAAI.

[8] Ada Wai-Chee Fu, et al. Mining association rules with weighted items, 1998, IDEAS '98.

[9] Mark A. Hall, et al. Correlation-based Feature Selection for Machine Learning, 2003.

[10] Thomas G. Dietterich, et al. Learning with Many Irrelevant Features, 1991, AAAI.

[11] Ke Wang, et al. Mining confident rules without support requirement, 2001, CIKM '01.

[12] Ramakrishnan Srikant, et al. Fast algorithms for mining association rules, 1998, VLDB.

[13] Robert C. Holte, et al. Very Simple Classification Rules Perform Well on Most Commonly Used Datasets, 1993, Machine Learning.

[14] Larry A. Rendell, et al. A Practical Approach to Feature Selection, 1992, ML.

[15] Yiming Yang, et al. A Comparative Study on Feature Selection in Text Categorization, 1997, ICML.

[16] Mark A. Hall, et al. Correlation-based Feature Selection for Discrete and Numeric Class Machine Learning, 1999, ICML.

[17] Tim Menzies, et al. Practical large scale what-if queries: case studies with software risk assessment, 2000, ASE 2000.

[18] Ron Kohavi, et al. Wrappers for Feature Subset Selection, 1997, Artif. Intell.

[19] Leo Breiman, et al. Classification and Regression Trees, 1984.

[20] Susan T. Dumais, et al. Inductive learning algorithms and representations for text categorization, 1998, CIKM '98.

[21] Tim Menzies, et al. Many Maybes Mean (Mostly) the Same Thing, 2004.