Why Fuzzy Decision Trees are Good Rankers

Several fuzzy extensions of decision tree induction, an established machine-learning method, have been proposed in the literature. So far, however, fuzzy decision trees have been used almost exclusively for classification. In this paper, we show that a fuzzy extension of decision trees is arguably more useful for another performance task, namely ranking. Roughly, the goal of ranking is to order a set of instances from most likely positive to most likely negative. The motivation for applying fuzzy decision trees to this problem originates from recent investigations of the ranking performance of conventional decision trees, which this paper continues and complements. Our results reveal some properties that seem to be crucial for good ranking performance, properties that fuzzy decision trees offer better and more naturally than conventional ones. Most notably, a fuzzy decision tree produces scores in terms of membership degrees on a fine-grained scale. Using these membership degrees as a ranking criterion elegantly solves a key problem of conventional decision trees: how to break ties between instances in the same leaf or, more generally, between equally scored instances.
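The tie-breaking effect can be illustrated with a toy sketch. It is not the induction method of the paper: the single split, the piecewise-linear membership function, the leaf probabilities, and the six data points are all assumptions chosen for illustration. A crisp split assigns every instance in a leaf the same score, whereas a soft split blends the leaf scores by membership degree, so instances falling on the same side of the threshold can still be ordered.

```python
def auc(scores, labels):
    """Area under the ROC curve; a tied positive/negative pair counts 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def crisp_score(x, threshold=0.5, p_left=0.2, p_right=0.8):
    """Conventional split: the score is the positive rate of the leaf reached."""
    return p_right if x > threshold else p_left


def fuzzy_score(x, threshold=0.5, width=0.5, p_left=0.2, p_right=0.8):
    """Soft split (hypothetical): blend leaf rates by membership degree."""
    # Membership in the right branch rises linearly from 0 to 1
    # over the interval [threshold - width, threshold + width].
    mu_right = min(1.0, max(0.0, (x - (threshold - width)) / (2 * width)))
    return (1.0 - mu_right) * p_left + mu_right * p_right


# Illustrative data: one numeric attribute and binary labels.
xs = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]
ys = [0, 0, 1, 0, 1, 1]

crisp = [crisp_score(x) for x in xs]
fuzzy = [fuzzy_score(x) for x in xs]

print("distinct crisp scores:", len(set(crisp)))
print("distinct fuzzy scores:", len(set(fuzzy)))
print("crisp AUC:", round(auc(crisp, ys), 3))
print("fuzzy AUC:", round(auc(fuzzy, ys), 3))
```

On this data the crisp split yields only two distinct scores, so many positive/negative pairs are tied, while the soft split scores every instance differently. The higher AUC of the soft split is a property of this constructed example and not a general guarantee.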
