A Survey of Logic Based Classifiers

Classification is one of the most challenging and innovative problems in data mining, and classification techniques have been a focus of research for years. Logic-based, perceptron-based, instance-based, and statistical classifiers are available to solve the classification problem. This work covers the logic-based classifiers, known as decision tree classifiers because they use logic-based algorithms to classify data on the basis of feature values. A splitting criterion on the attributes is used to generate the tree. A classifier can be implemented serially or in parallel, depending on the size of the data set. Some classifiers, such as SLIQ, SPRINT, CLOUDS, BOAT, and RainForest, can be implemented in parallel, whereas ID3, CART, C4.5, and C5.0 are serial classifiers. In some classifiers the building phase receives more emphasis, in order to improve scalability along with the quality of the classifier. This study provides an overview of different logic-based classifiers and compares them against our pre-defined criteria. We conclude that SLIQ and SPRINT are suitable for larger data sets, whereas C4.5 and C5.0 are best suited to smaller data sets.
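To make the idea of a splitting criterion concrete, the following is a minimal sketch of entropy-based information gain, the kind of criterion used by ID3. It is an illustration under stated assumptions and not the procedure of any specific classifier surveyed here: the function names (`entropy`, `information_gain`, `best_split`) and the toy data are hypothetical, and only categorical attributes are handled.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction from partitioning the data on one categorical attribute."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the partitions produced by the split.
    remainder = sum(
        len(part) / len(labels) * entropy(part) for part in partitions.values()
    )
    return entropy(labels) - remainder


def best_split(rows, labels):
    """Pick the attribute with the highest information gain (hypothetical helper)."""
    return max(rows[0].keys(), key=lambda a: information_gain(rows, labels, a))


# Toy data set (an assumption for illustration only).
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]

print(best_split(rows, labels))
```

In this toy data, "outlook" separates the two classes perfectly while "windy" carries no information, so a tree builder using this criterion would split on "outlook" at the root. Other classifiers use different criteria (CART, for example, uses the Gini index), but the selection loop has the same shape.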
