Optimal Decision Trees

We propose an Extreme Point Tabu Search (EPTS) algorithm that constructs globally optimal decision trees for classification problems. Typically, decision tree algorithms are greedy. They optimize the misclassification error of each decision sequentially. Our non-greedy approach minimizes the misclassification error of all the decisions in the tree concurrently. Using Global Tree Optimization (GTO), we can optimize existing decision trees. This capability can be used in classification and data mining applications to avoid overfitting, transfer knowledge, incorporate domain knowledge, and maintain existing decision trees. Our method works by fixing the structure of the decision tree and then representing it as a set of disjunctive linear inequalities. An optimization problem is constructed that minimizes the errors within the disjunctive linear inequalities. To reduce the misclassification error, a nonlinear error function is minimized over a polyhedral region. We show that it is sufficient to restrict our search to the extreme points of the polyhedral region. A new EPTS algorithm is used to search the extreme points of the polyhedral region for an optimal solution. Promising computational results are given for both randomly generated and real-world problems.
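To make the idea of a fixed tree structure expressed as disjunctive linear inequalities more concrete, the following is a minimal sketch and not the paper's formulation. It assumes a hypothetical tree with three linear decisions and four leaves, and it simply counts misclassified points, whereas the abstract describes minimizing a nonlinear error function over a polyhedral region. All names (`LEAVES`, `misclassification_count`, the toy data) are illustrative assumptions.

```python
# Assumed representation: a decision is (weights w, threshold gamma),
# and a point x goes right when w.x > gamma, left otherwise.

# Hypothetical fixed structure: root (decision 0) with two children
# (decisions 1 and 2). Each leaf is a root-to-leaf path of
# (decision index, go_right) steps together with a class label.
LEAVES = [
    ([(0, False), (1, False)], "A"),
    ([(0, False), (1, True)], "B"),
    ([(0, True), (2, False)], "B"),
    ([(0, True), (2, True)], "A"),
]


def goes_right(decision, x):
    """Evaluate one linear inequality w.x > gamma."""
    w, gamma = decision
    return sum(wi * xi for wi, xi in zip(w, x)) > gamma


def satisfies_path(decisions, path, x):
    """Conjunction: x meets every inequality along one root-to-leaf path."""
    return all(goes_right(decisions[i], x) == right for i, right in path)


def correctly_classified(decisions, x, label):
    """Disjunction: x reaches at least one leaf that carries its own label."""
    return any(
        satisfies_path(decisions, path, x)
        for path, leaf_label in LEAVES
        if leaf_label == label
    )


def misclassification_count(decisions, points, labels):
    """Error of the whole tree, evaluated over all decisions at once."""
    return sum(
        not correctly_classified(decisions, x, y)
        for x, y in zip(points, labels)
    )


# Toy XOR-style data (illustrative only, not from the paper).
POINTS = [(0, 0), (1, 1), (0, 1), (1, 0)]
LABELS = ["A", "A", "B", "B"]

good = [((1, 0), 0.5), ((0, 1), 0.5), ((0, 1), 0.5)]
bad = [((1, 0), 0.5), ((0, 1), 0.5), ((0, 1), 1.5)]

print(misclassification_count(good, POINTS, LABELS))
print(misclassification_count(bad, POINTS, LABELS))
```

Because the error is a function of all three decisions together, changing one threshold (as in `bad`) changes the error of the whole tree. This is the quantity a non-greedy method would search over, in contrast to fitting each decision one at a time.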
