Accuracy Prediction for Distributed Decision Tree using Machine Learning approach

Machine learning is a field of computer science that has produced many valuable solutions to complex problems. The decision tree is one such solution for decision-making problems: it learns from data in the problem domain and builds a model that can be used for prediction, supported by systematic analytics. To build a model on a huge dataset, the decision tree algorithm needs to be adapted to a distributed environment so that training time is reduced without compromising the accuracy of the resulting tree. In this paper, we propose an enhanced version of the distributed decision tree algorithm that reduces model building time without compromising accuracy.
