Hybrid Cost-Sensitive Decision Tree

Cost-sensitive decision trees and cost-sensitive naive Bayes are two recently proposed cost-sensitive learning models that aim to minimize the total cost of tests and misclassifications. Each has its advantages and disadvantages. In this paper, we propose a novel cost-sensitive learning model, a hybrid cost-sensitive decision tree called DTNB, which integrates the advantages of the cost-sensitive decision tree and the cost-sensitive naive Bayes to reduce the total cost. We empirically evaluate it over various test strategies, and our experiments show that DTNB significantly outperforms both the cost-sensitive decision tree and the cost-sensitive naive Bayes in minimizing the total cost of tests and misclassifications under the same sequential test strategies and single batch strategies.
