An Efficient Classification Approach for Data Mining

Decision trees are an important method in both induction research and data mining, used mainly for classification and prediction. ID3 is the most widely used decision tree algorithm to date. This paper discusses a shortcoming of ID3, namely its tendency to favor attributes with many values, and then proposes a new decision tree algorithm that is an improved version of ID3. In the proposed algorithm, attributes are divided into groups and the selection measure is applied to these groups. If the resulting information gain is not good, the attribute values are divided into groups again. These steps are repeated until a good classification/misclassification ratio is obtained. The proposed algorithm classifies data sets more accurately and efficiently.
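
The bias described above can be illustrated with a minimal sketch. It uses the standard ID3 information gain and a toy data set; the data, the attribute names (`record_id`, `outlook`), the helper `group_values`, and the particular grouping are all hypothetical choices made for illustration and are not the grouping rule of the proposed algorithm, which the abstract does not specify.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(values, labels):
    """Standard ID3 information gain of splitting `labels` by `values`."""
    partitions = defaultdict(list)
    for value, label in zip(values, labels):
        partitions[value].append(label)
    total = len(labels)
    remainder = sum(len(part) / total * entropy(part)
                    for part in partitions.values())
    return entropy(labels) - remainder


def group_values(values, mapping):
    """Replace each attribute value by the group it is assigned to.

    `mapping` is a hypothetical value-to-group dictionary; how groups
    are actually chosen is not stated in the abstract.
    """
    return [mapping[v] for v in values]


# Hypothetical toy data: 8 records, two classes.
labels = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]

# A many-valued attribute: one distinct value per record.
record_id = ["r1", "r2", "r3", "r4", "r5", "r6", "r7", "r8"]

# A two-valued attribute that is genuinely related to the class.
outlook = ["sunny", "sunny", "rain", "rain",
           "sunny", "rain", "sunny", "sunny"]

gain_id = information_gain(record_id, labels)
gain_outlook = information_gain(outlook, labels)

# Assumed grouping: merge the eight identifiers into two arbitrary groups.
id_groups = {"r1": "A", "r2": "A", "r3": "A", "r4": "A",
             "r5": "B", "r6": "B", "r7": "B", "r8": "B"}
gain_grouped = information_gain(group_values(record_id, id_groups), labels)

print("gain(record_id)         =", gain_id)
print("gain(outlook)           =", gain_outlook)
print("gain(record_id grouped) =", gain_grouped)
```

In this toy case the identifier attribute receives the largest gain because every partition contains a single record, even though it has no predictive value. Once its values are merged into groups the measure is computed over fewer, larger partitions and the spurious advantage disappears.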
