K Nearest Neighbor Edition to Guide Classification Tree Learning: Motivation and Experimental Results

This paper presents a new hybrid classifier that combines the Nearest Neighbor distance-based algorithm with the Classification Tree paradigm. The Nearest Neighbor algorithm is used as a preprocessing step to obtain an edited training database for the subsequent induction of the classification tree. The experimental section reports the results obtained by the new algorithm; comparing these results with those obtained by classification trees induced from the original training data, we find that the new approach performs better than or equal to the standard approach according to the Wilcoxon signed-rank statistical test.
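The preprocessing idea can be illustrated with a minimal sketch of nearest-neighbor editing: an instance is kept only when the majority of its k nearest neighbors share its class label, and the edited set would then be handed to any classification-tree learner. The function name, the distance choice, and the tiny dataset below are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch of k-NN editing as a preprocessing step (assumed
# names and data; the paper's exact editing rule may differ).
from collections import Counter
import math

def knn_edit(data, k=3):
    """Return the subset of (features, label) pairs whose label agrees
    with the majority label of their k nearest neighbors."""
    edited = []
    for i, (x, y) in enumerate(data):
        # All other instances, sorted by Euclidean distance to x.
        others = [data[j] for j in range(len(data)) if j != i]
        others.sort(key=lambda p: math.dist(p[0], x))
        votes = Counter(label for _, label in others[:k])
        if votes.most_common(1)[0][0] == y:
            edited.append((x, y))  # neighbors agree: keep the instance
    return edited

# Tiny illustrative dataset: the "b" instance sitting inside the "a"
# cluster is treated as noise and removed by the editing step.
data = [((0.0, 0.0), "a"), ((0.1, 0.0), "a"), ((0.0, 0.1), "a"),
        ((0.1, 0.1), "b"),
        ((5.0, 5.0), "b"), ((5.1, 5.0), "b"), ((5.0, 5.1), "b")]
print(len(knn_edit(data, k=3)))  # prints 6
```

In a full pipeline, the returned edited set (rather than the raw training data) would be used to induce the tree, which is the comparison the experimental section evaluates.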
