Multi Branch Decision Tree: A New Splitting Criterion

In this paper, a new splitting criterion for building decision trees is proposed. A splitting criterion specifies the best splitting variable and its threshold at each node of the tree. Borrowing the idea of the classical Forward Selection method and its enhanced versions, the variable with the largest absolute correlation with the target is chosen as the splitting variable at each node. Then, the SVM idea of maximizing the margin between classes is used to find the best threshold on the selected variable. This procedure is executed recursively at each node until the leaf nodes are reached. The resulting decision tree is considerably shorter than those produced by previous methods, which eliminates more irrelevant variables and reduces the classification time for future data. The method also generates unclassified regions, which can be interpreted as either an advantage or a disadvantage. Simulation results demonstrate these improvements in the proposed decision tree.
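The node-splitting procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `choose_split`, the midpoint-of-the-widest-gap threshold rule, and the toy data are all assumptions made here to show the two-step idea (correlation-based variable choice, then a margin-style threshold).

```python
import numpy as np

def choose_split(X, y):
    """Return (feature index, threshold) for one node.

    Sketch of the proposed criterion: the feature with the largest
    absolute correlation with the target is selected, then the
    threshold is placed at the midpoint of the widest gap between
    neighboring opposite-class values along that feature (an
    SVM-style maximum-margin cut). Illustrative only.
    """
    # Step 1: Forward-Selection-style variable choice: largest |corr|.
    corrs = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
    j = int(np.argmax(corrs))

    # Step 2: margin-maximizing threshold along feature j.
    order = np.argsort(X[:, j])
    xs, ys = X[order, j], y[order]
    best_gap, thr = -1.0, xs[0]
    for i in range(len(xs) - 1):
        if ys[i] != ys[i + 1]:          # class boundary between neighbors
            gap = xs[i + 1] - xs[i]
            if gap > best_gap:
                best_gap = gap
                thr = (xs[i] + xs[i + 1]) / 2.0
    return j, thr

# Toy example: feature 0 separates the classes, feature 1 is noise.
X = np.array([[1.0, 5.0], [2.0, 1.0], [8.0, 4.0], [9.0, 2.0]])
y = np.array([0, 0, 1, 1])
feat, thr = choose_split(X, y)  # splits on feature 0, midway in the gap
```

In a full tree builder this function would be applied recursively to the two resulting subsets until a leaf condition is met; points falling inside the margin around each threshold would form the unclassified regions mentioned above.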
