CLUSTERING-BASED DECISION TREE CLASSIFIER CONSTRUCTION

Abstract. This article investigates how cluster analysis can be used to explore the structure of data. Density structures within classes are explored in order to implement class decomposition, with the aim of enhancing the performance of decision tree classifiers. Classes are decomposed using cluster analysis, and candidate cluster merges are evaluated using decision tree classifiers. The impact of class decomposition is then shown on the C4.5 and CART classifiers. The main focus is on experiments carried out with real-valued data sets. The experiments are described step by step to illustrate the patterns discovered, which affect previously proposed patterns in class decomposition methodology.
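As a rough illustration of the class decomposition step described above, the following minimal sketch clusters the instances of each class separately and relabels them with sub-class labels that a decision tree learner could then be trained on. It is not the article's procedure: the small k-means routine, the fixed cluster count `k`, and the names `kmeans`, `decompose_classes`, and `to_original` are all assumptions made for illustration, and the merge evaluation step is omitted.

```python
from collections import defaultdict


def kmeans(points, k, iters=20):
    """Tiny deterministic k-means (assumed stand-in for the clustering step).

    The first k points seed the centroids, so results are reproducible.
    """
    centroids = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        assign = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def decompose_classes(X, y, k=2):
    """Replace each class label with a (label, cluster_id) sub-class label."""
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)
    sub = [None] * len(y)
    for label, idx in by_class.items():
        # Never ask for more clusters than the class has instances.
        assign = kmeans([X[i] for i in idx], min(k, len(idx)))
        for i, a in zip(idx, assign):
            sub[i] = (label, a)
    return sub


def to_original(sub_label):
    """Map a sub-class prediction back to its original class."""
    return sub_label[0]


# Hypothetical toy data: class "a" forms two separate dense groups.
X = [(0, 0), (0, 1), (10, 10), (10, 11), (5, 5), (5, 6)]
y = ["a", "a", "a", "a", "b", "b"]
sub = decompose_classes(X, y, k=2)
```

A classifier would be trained on `sub` instead of `y`, and its predictions mapped back with `to_original` before accuracy is measured against the original classes.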
