Paper information - Exploring Association between Optimal Depth and Number of Features used in Decision Tree

Exploring Association between Optimal Depth and Number of Features used in Decision Tree

Decision Tree (DT) is a widely used predictive modelling tool with applications in areas including business, energy modelling, medicine and remote sensing. It is a non-parametric supervised learning method used for both classification and regression tasks. Building an optimal decision tree is known to be an NP-hard problem. Many factors affect the accuracy of decision trees, and a great deal of research has been done in this area. In the present work we explore the relationship between the number of features used for modelling and the optimal depth of the decision tree. Several datasets were tested, and some relationship between the two was observed. The work may help reduce testing time significantly by setting an upper bound on the depth value when searching for the optimal depth.

Index Terms: Decision tree, predictive modelling, optimal depth, accuracy.
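The depth search described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn and a synthetic dataset, since the paper's datasets are not listed here. The helper `search_optimal_depth` and the cap `depth_cap = n_features` are hypothetical choices for illustration only; the abstract does not state the actual form of the relationship or the bound the authors propose.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier


def search_optimal_depth(X, y, depth_cap, cv=5, seed=0):
    """Return (best_depth, scores) for depths 1..depth_cap.

    scores maps each depth to its mean cross-validated accuracy.
    Hypothetical helper, not taken from the paper.
    """
    scores = {}
    for depth in range(1, depth_cap + 1):
        tree = DecisionTreeClassifier(max_depth=depth, random_state=seed)
        scores[depth] = cross_val_score(tree, X, y, cv=cv).mean()
    # Highest mean accuracy wins; ties go to the shallowest tree.
    best_depth = min(scores, key=lambda d: (-scores[d], d))
    return best_depth, scores


# Synthetic data stands in for the paper's (unlisted) datasets.
n_features = 8
X, y = make_classification(
    n_samples=400,
    n_features=n_features,
    n_informative=5,
    n_redundant=0,
    random_state=0,
)

# ASSUMPTION: cap the search at the number of features. This is only a
# placeholder for whatever bound the paper derives from its experiments.
depth_cap = n_features
best_depth, scores = search_optimal_depth(X, y, depth_cap)
print(f"features={n_features}, searched depths 1..{depth_cap}, best depth={best_depth}")
```

The point of the sketch is the shape of the search: one cross-validated fit per candidate depth, so any feature-based upper bound on depth directly limits the number of trees that have to be trained and evaluated.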
