Performance Evaluation of Associative Classifiers in Perspective of Discretization Methods

ARTICLE INFO
Article history: Received: 22 April, 2017; Accepted: 16 May, 2017; Online: 19 June, 2017

ABSTRACT
Discretization is the process of converting numerical values into categorical values. The literature offers many techniques for discretizing numerical data, and most classification approaches operate on discretized databases, so the choice of discretization method greatly influences classification performance. In this article, we investigate the effect of discretization methods on the performance of associative classifiers. We compare the accuracy of two associative classifiers, CBA and CBA2, under a selection of discretization methods: 1R Discretizer (1R-D), Ameva Discretizer (Ameva-D), Bayesian Discretizer (Bayesian-D), Discretization algorithm based on Class-Attribute Contingency Coefficient (CACC-D), Class-Attribute Dependent Discretizer (CADD-D), Distribution-Index-Based Discretizer (DIBD-D), Cluster Analysis (ClusterAnalysis-D), Chi-Merge Discretizer (ChiMerge-D) and Chi2 Discretizer (Chi2-D). The main objective of this study is to investigate the impact of the discretization method on the performance of the associative classifier while keeping all other experimental parameters constant. Our experimental results show that the performance of the associative classifier varies significantly with the discretization method, so the classifier's accuracy depends strongly on which method is selected. For this comparative study, we use the implementations of these methods in the KEEL data mining tool on public datasets.
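As a minimal illustration of why the choice of discretizer matters, the sketch below maps the same numeric attribute to categories with two simple unsupervised schemes, equal-width and equal-frequency binning. These two schemes are not among the nine methods compared in the study, and none of this is KEEL code; the function names and the `ages` sample are hypothetical, chosen only to show that different cut points give a classifier different categorical inputs.

```python
from bisect import bisect_right


def equal_width_cuts(values, bins):
    """Cut points splitting [min, max] into `bins` intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    return [lo + width * i for i in range(1, bins)]


def equal_frequency_cuts(values, bins):
    """Cut points placing roughly the same number of values in each interval."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(n * i) // bins] for i in range(1, bins)]


def discretize(values, cuts):
    """Map each numeric value to the index of the interval it falls into."""
    return [bisect_right(cuts, v) for v in values]


# Hypothetical numeric attribute (illustrative values only).
ages = [18, 19, 20, 21, 22, 23, 40, 65]

width_bins = discretize(ages, equal_width_cuts(ages, 3))
freq_bins = discretize(ages, equal_frequency_cuts(ages, 3))

print(width_bins)  # [0, 0, 0, 0, 0, 0, 1, 2]
print(freq_bins)   # [0, 0, 1, 1, 1, 2, 2, 2]
```

The two schemes assign the same records to different categories, so any rule mined from the discretized attribute (for example by an associative classifier) would see different items depending on the scheme used.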
