An Optimal Constrained Pruning Strategy for Decision Trees

This paper is concerned with the optimal constrained pruning of decision trees. We present a novel 0-1 programming model for pruning the tree to minimize some general penalty function based on the resulting leaf nodes, and show that this model possesses a totally unimodular structure that enables it to be solved as a shortest-path problem on an acyclic graph. Moreover, we prove that this problem can be solved in strongly polynomial time while incorporating an additional constraint on the number of residual leaf nodes. Furthermore, the framework of the proposed modeling approach renders it suitable to accommodate different (multiple) objective functions and side-constraints, and we identify various such modeling options that can be applied in practice. The developed methodology is illustrated using a numerical example to provide insights, and some computational results are presented to demonstrate the efficacy of solving generically constrained problems of this type. We also apply this technique to a large-scale transportation analysis and simulation system (TRANSIMS), and present related computational results using real data to exhibit the flexibility and effectiveness of the proposed approach.
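
To make the leaf-constrained pruning problem concrete, the following minimal sketch solves a small instance by bottom-up dynamic programming over the tree. It is not the 0-1 model or the shortest-path construction developed in the paper; it is a swapped-in, standard technique that yields the same kind of answer for an additive penalty. The `Node` class, the functions `leaf_costs`, `min_penalty`, and `example_tree`, and the assumption that the total penalty is the sum of per-leaf penalties are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical tree node: `penalty` is the cost paid if this node ends up as a leaf."""
    penalty: float
    children: List["Node"] = field(default_factory=list)


def leaf_costs(node: Node, max_leaves: int) -> Dict[int, float]:
    """Map k -> minimum total penalty over prunings of this subtree with exactly k leaves.

    Assumption: a pruning either collapses a node into a leaf or keeps all of
    its children, and the objective is additive over the resulting leaves.
    """
    best = {1: node.penalty}  # option 1: collapse this node into a single leaf
    if not node.children or max_leaves < len(node.children):
        return best  # cannot (or need not) expand this node within the leaf budget

    # Option 2: keep all children and combine their tables, knapsack-style.
    combined = {0: 0.0}
    for child in node.children:
        child_table = leaf_costs(child, max_leaves)
        merged: Dict[int, float] = {}
        for k1, c1 in combined.items():
            for k2, c2 in child_table.items():
                k = k1 + k2
                if k <= max_leaves and c1 + c2 < merged.get(k, float("inf")):
                    merged[k] = c1 + c2
        combined = merged

    for k, cost in combined.items():
        if cost < best.get(k, float("inf")):
            best[k] = cost
    return best


def min_penalty(root: Node, max_leaves: int) -> float:
    """Minimum total penalty of a pruned tree with at most `max_leaves` leaves."""
    if max_leaves < 1:
        raise ValueError("a pruned tree has at least one leaf")
    return min(leaf_costs(root, max_leaves).values())


def example_tree() -> Node:
    """Illustrative data only: a root with one expandable child and one leaf child."""
    a = Node(4.0, [Node(1.0), Node(1.0)])
    b = Node(3.0)
    return Node(10.0, [a, b])
```

On the illustrative tree, tightening the leaf budget from three to two to one forces progressively coarser prunings, which is the trade-off that the constraint on the number of residual leaf nodes controls.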
