Exploring the Association between Optimal Depth and Number of Features Used in Decision Trees

The Decision Tree (DT) is a widely used predictive modelling tool with applications in areas including business, energy modelling, medicine and remote sensing. It is a non-parametric supervised learning method used for both classification and regression tasks. Building an optimal decision tree is an NP-hard problem. A number of factors affect the accuracy of decision trees, and a lot of research has been done in this area. In the present work we explore the relationship between the number of features used for modelling and the optimal depth of a decision tree. Different datasets were tested, and the results indicate that the two are related. The work may help reduce testing time significantly by setting an upper bound on the depth values tried when searching for the optimal depth.

Index Terms: Decision tree, predictive modelling, optimal depth, accuracy.
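The idea of bounding the search for the optimal depth can be illustrated with a minimal sketch. It is not the procedure used in this work: the function `search_depth`, the scoring function `toy_score`, and the choice of the number of features as the cap are all hypothetical placeholders. In practice the score at each depth would be something like the cross-validated accuracy of a tree trained with that maximum depth.

```python
from typing import Callable, Dict, Tuple


def search_depth(
    score_at_depth: Callable[[int], float], depth_cap: int
) -> Tuple[int, Dict[int, float]]:
    """Score every depth from 1 to depth_cap and return the best one.

    Ties are broken in favour of the shallower tree.
    """
    if depth_cap < 1:
        raise ValueError("depth_cap must be at least 1")
    scores = {d: score_at_depth(d) for d in range(1, depth_cap + 1)}
    best_depth = max(scores, key=lambda d: (scores[d], -d))
    return best_depth, scores


def toy_score(depth: int) -> float:
    # Hypothetical stand-in for validation accuracy; peaks at depth 4.
    return 0.9 - 0.02 * abs(depth - 4)


# Assumption for illustration only: cap the search at the number of features.
n_features = 8
best_depth, scores = search_depth(toy_score, depth_cap=n_features)
print(best_depth, len(scores))
```

With a cap in place, only `depth_cap` candidate depths are evaluated, which is where the saving in testing time would come from.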