Extended Space Decision Tree

Extending the attribute space of a dataset typically increases the prediction accuracy of a decision tree built from that dataset. The attribute space is often extended by randomly combining two or more attributes. In this paper, we propose a novel approach to space extension in which we include only those combined attributes that have high classification capacity. We expect that adding these attributes to the attribute space will increase the prediction accuracy of trees built from datasets with the extended space. We conduct experiments on five datasets from the UCI Machine Learning Repository. Our experimental results indicate that the proposed space extension produces trees of higher accuracy than those built on the original attribute space. Moreover, the experimental results demonstrate a clear superiority of the proposed technique over an existing space extension technique.
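As a concrete illustration, the following Python sketch shows one way such a selective space extension could be realized. It is a minimal sketch under stated assumptions, not the paper's implementation: combined attributes are formed as pairwise products of the original attributes, "classification capacity" is approximated by mutual information with the class label, and the threshold tau, the product operator, and the use of scikit-learn are all illustrative choices of ours.

```python
import numpy as np
from itertools import combinations
from sklearn.datasets import load_iris
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

def extend_space(X, y, tau=0.3):
    """Append to X the pairwise-product attributes whose estimated
    mutual information with the class label y exceeds tau.
    (Illustrative stand-in for the paper's classification-capacity test.)"""
    # Generate all pairwise-product candidate attributes.
    products = np.column_stack(
        [X[:, i] * X[:, j] for i, j in combinations(range(X.shape[1]), 2)]
    )
    # Estimate each candidate's relevance to the class label.
    mi = mutual_info_classif(products, y, random_state=0)
    # Keep only the high-capacity combinations.
    high_capacity = products[:, mi > tau]
    return np.hstack([X, high_capacity]) if high_capacity.shape[1] else X

X, y = load_iris(return_X_y=True)
X_ext = extend_space(X, y)

tree = DecisionTreeClassifier(random_state=0)
print("original space:", cross_val_score(tree, X, y, cv=5).mean())
print("extended space:", cross_val_score(tree, X_ext, y, cv=5).mean())
```

On other datasets, the threshold and the combination operator would need tuning; the sketch only mirrors the overall idea of filtering combined attributes before tree construction, not the paper's reported results.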
