MML Inference of Decision Graphs with Multi-way Joins and Dynamic Attributes

A decision tree is a comprehensible representation that has been widely used in many machine learning domains. In supervised learning, however, decision trees have well-known limitations, two notable ones being the replication problem and the fragmentation problem. One way of addressing these problems is to use decision graphs, a generalization of the decision tree that allows disjunctions, or joins. While various decision graph systems are available, all of them impose some form of restriction on the representations they can propose, often resulting either in new redundancies or in the original redundancy not being removed. In this paper, we propose an unrestricted representation called the decision graph with multi-way joins, which has improved representational power and is able to use training data efficiently. An algorithm for inferring these decision graphs with multi-way joins using the Minimum Message Length (MML) principle is also introduced. On both real-world and artificial data with only discrete attributes (including at least five UCI data-sets), and in terms of both "right"/"wrong" classification accuracy and I.J. Good's logarithm-of-probability "bit-costing" predictive accuracy, our novel multi-way join decision graph program significantly out-performs both C4.5 and C5.0. Our program also out-performs the Oliver and Wallace binary join decision graph program on the only data-set available for comparison.
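
The "bit-costing" predictive accuracy mentioned above is I.J. Good's logarithmic score: the negative base-2 logarithm of the probability the classifier assigns to the true class, accumulated over test items. As an illustration only (a minimal Python sketch with hypothetical names and data layout, not the authors' code), the snippet below shows how such a probabilistic cost can be computed alongside plain right/wrong accuracy.

```python
import math

def evaluate(predicted_probs, true_classes):
    """Score a probabilistic classifier on a test set using
    (a) plain right/wrong accuracy and (b) a logarithmic 'bit-cost':
    -log2 of the probability assigned to the true class.

    predicted_probs: list of dicts mapping class label -> probability
    true_classes:    list of the corresponding true class labels
    (Argument names and layout are illustrative assumptions.)
    """
    correct = 0
    total_bits = 0.0
    for probs, truth in zip(predicted_probs, true_classes):
        # Right/wrong accuracy: count the item as correct if the true
        # class receives the highest predicted probability.
        if max(probs, key=probs.get) == truth:
            correct += 1
        # Bit-cost: a confident wrong prediction is penalised heavily,
        # a confident correct prediction costs close to 0 bits.
        total_bits += -math.log2(probs[truth])
    n = len(true_classes)
    return correct / n, total_bits / n

# Tiny usage example with two test items and classes 'a'/'b'.
probs = [{'a': 0.9, 'b': 0.1}, {'a': 0.3, 'b': 0.7}]
truth = ['a', 'b']
accuracy, bits_per_item = evaluate(probs, truth)
print(f"accuracy = {accuracy:.2f}, mean bit-cost = {bits_per_item:.3f} bits")
```

Unlike right/wrong accuracy, this measure rewards well-calibrated class probabilities, which is why the paper reports it in addition to classification accuracy.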

[1] Yishay Mansour, et al. Boosting Using Branching Programs, 2000, J. Comput. Syst. Sci.

[2] C. S. Wallace, et al. An Information Measure for Classification, 1968, Comput. J.

[3] J. Rissanen, et al. Modeling by Shortest Data Description, 1978, Autom.

[4] David L. Dowe, et al. MML clustering of multi-state, Poisson, von Mises circular and Gaussian distributions, 2000, Stat. Comput.

[5] I. Good. Corroboration, Explanation, Evolving Probability, Simplicity and a Sharpened Razor, 1968, The British Journal for the Philosophy of Science.

[6] Jorma Rissanen, et al. MDL-Based Decision Tree Pruning, 1995, KDD.

[7] David L. Dowe, et al. A decision graph explanation of protein secondary structure prediction, 1993, Proceedings of the Twenty-sixth Hawaii International Conference on System Sciences.

[8] David L. Dowe, et al. Minimum Message Length and Kolmogorov Complexity, 1999, Comput. J.

[9] C. S. Wallace, et al. Estimation and Inference by Compact Coding, 1987.

[10] C. S. Wallace, et al. Coding Decision Trees, 1993, Machine Learning.

[11] Ron Kohavi, et al. Bottom-Up Induction of Oblivious Read-Once Decision Graphs: Strengths and Limitations, 1994, AAAI.

[12] J. Ross Quinlan, et al. C4.5: Programs for Machine Learning, 1992.

[13] Ronald L. Rivest, et al. Inferring Decision Trees Using the Minimum Description Length Principle, 1989, Inf. Comput.

[14] Catherine Blake, et al. UCI Repository of machine learning databases, 1998.