A GRASP method for building classification trees

This paper proposes a new method for constructing binary classification trees. The aim is to build simple trees, i.e. trees that are as small and uncomplicated as possible, which makes them easier to interpret and favours a balance between fitting the training data and generalizing to the test data sets. The proposed method is based on the metaheuristic strategy known as GRASP, used in conjunction with optimization tasks. Essentially, the method modifies the criterion for selecting the attribute that determines the split at each node by incorporating a controlled amount of randomisation. We compare our method with the traditional method by means of a set of computational experiments. We conclude that the GRASP method, for small levels of randomness, significantly reduces tree complexity without decreasing classification accuracy.
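The controlled randomisation of the split criterion can be illustrated with a minimal sketch. The abstract does not state the exact selection rule, so the code below assumes the standard GRASP construction: a restricted candidate list (RCL) of attributes whose split quality lies within a fraction `alpha` of the best, from which one attribute is drawn at random. The function name `grasp_select`, the parameter `alpha`, and the example scores are hypothetical and are not taken from the paper.

```python
import random


def grasp_select(scores, alpha, rng=random):
    """Pick a split attribute from a restricted candidate list (RCL).

    scores: dict mapping attribute name -> split quality (higher is better),
            e.g. an information gain computed elsewhere (assumed input).
    alpha:  randomness level in [0, 1]; 0 is purely greedy,
            1 draws uniformly from all attributes.
    rng:    object with a choice() method, e.g. random.Random(seed).
    """
    if not scores:
        raise ValueError("scores must not be empty")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    best = max(scores.values())
    worst = min(scores.values())
    # Attributes scoring at least this much are admitted to the RCL.
    threshold = best - alpha * (best - worst)
    rcl = [attr for attr, s in scores.items() if s >= threshold]
    return rng.choice(rcl)


if __name__ == "__main__":
    # Hypothetical split qualities for three attributes at one node.
    gains = {"age": 0.25, "income": 0.24, "region": 0.05}
    print(grasp_select(gains, alpha=0.0))  # greedy choice: "age"
    print(grasp_select(gains, alpha=0.1, rng=random.Random(0)))  # "age" or "income"
```

With `alpha = 0` the rule reduces to the usual greedy choice made by traditional tree builders. Small positive values admit only near-best attributes, which is the regime the abstract reports as reducing tree complexity without loss of accuracy.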
