Learning decision trees with taxonomy of propositionalized attributes

We introduce the Propositionalized Attribute Taxonomy guided Decision Tree Learner (PAT-DTL), an inductive learning algorithm that exploits a taxonomy of propositionalized attributes as prior knowledge to generate compact decision trees. Since taxonomies are unavailable in most domains, we also introduce the Propositionalized Attribute Taxonomy Learner (PAT-Learner), which automatically constructs a taxonomy from data. Our experimental results on UCI repository data sets show that the proposed algorithms generate decision trees that are generally more compact than, and often comparably accurate to, those produced by standard decision tree learners.
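To make the term "propositionalized attributes" concrete, the following minimal sketch assumes that each attribute=value pair is treated as a Boolean proposition and that an abstract taxonomy node is true whenever any of the leaf propositions beneath it is true. This is an illustration under those assumptions, not the PAT-DTL or PAT-Learner procedure itself; the names `propositionalize`, `holds`, and `wet_or_cloudy`, and the toy records, are hypothetical and do not come from the paper.

```python
from typing import Dict, FrozenSet, List, Tuple

# A proposition is an (attribute, value) pair, e.g. ("outlook", "rain").
Proposition = Tuple[str, str]


def propositionalize(record: Dict[str, str]) -> FrozenSet[Proposition]:
    """Return the set of attribute=value propositions that hold for a record."""
    return frozenset(record.items())


def holds(node: FrozenSet[Proposition], record: Dict[str, str]) -> bool:
    """Assumed semantics: a taxonomy node is true if any of its leaves is true."""
    return not node.isdisjoint(propositionalize(record))


# Hypothetical toy records, not taken from the paper's experiments.
records: List[Dict[str, str]] = [
    {"outlook": "sunny", "wind": "weak"},
    {"outlook": "rain", "wind": "strong"},
    {"outlook": "overcast", "wind": "weak"},
]

# Hypothetical abstract node that groups two leaf propositions.
wet_or_cloudy: FrozenSet[Proposition] = frozenset(
    {("outlook", "rain"), ("outlook", "overcast")}
)

# One Boolean test per record; a tree learner could split on such a test
# instead of on each leaf proposition separately.
flags = [holds(wet_or_cloudy, r) for r in records]
print(flags)
```

A single split on an abstract node like this can stand in for several splits on individual values, which is one way a taxonomy could lead to a smaller tree.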
