Predictability-based collective class association rule mining

Associative classification is a rule-based approach that uses candidate rules as classification criteria and gives decision makers results that are both highly accurate and easily interpretable. A key phase of associative classification is rule evaluation, which consists of rule ranking and pruning, in which poor rules are removed to improve performance. Existing association rule mining algorithms rely on frequency-based rule evaluation measures such as support and confidence; these do not provide sound statistical or computational measures for rule evaluation and often produce many redundant rules. In this research, we propose predictability-based collective class association rule mining, which is based on cross-validation and introduces a new rule evaluation step. We measure the prediction accuracy of each candidate rule in inner cross-validation steps: we split the training dataset into inner training sets and inner test sets and then evaluate the candidate rules' predictive performance. In several experiments, we show that the proposed algorithm outperforms some existing algorithms while retaining a large number of useful rules in the classifier. Furthermore, by applying the proposed algorithm to a real-life healthcare dataset, we demonstrate that it is practical and has the potential to reveal important patterns in the dataset.
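
The inner cross-validation idea described above can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names (`mine_candidates`, `inner_cv_rule_scores`), the toy records, the fold count, and the support threshold are all assumptions made for illustration, and candidate rules are enumerated by brute force rather than by a frequent-pattern miner. The sketch only shows how a rule's accuracy could be measured on held-out inner test folds instead of by support and confidence on the training data.

```python
from collections import defaultdict
from itertools import combinations


def mine_candidates(records, min_support=2, max_len=2):
    """Brute-force enumeration of (antecedent, label) rules that
    match at least min_support records (hypothetical helper)."""
    counts = defaultdict(int)
    for items, label in records:
        for size in range(1, max_len + 1):
            for antecedent in combinations(sorted(items), size):
                counts[(frozenset(antecedent), label)] += 1
    return {rule for rule, n in counts.items() if n >= min_support}


def inner_cv_rule_scores(records, k=3, min_support=2):
    """Score each candidate rule by its accuracy on inner test folds.

    For every fold, rules are mined on the inner training set only and
    then checked against the records of the inner test set they cover.
    """
    covered = defaultdict(int)  # inner test records matched by the antecedent
    correct = defaultdict(int)  # ... of which the predicted label was right
    for fold in range(k):
        inner_test = records[fold::k]
        inner_train = [r for i, r in enumerate(records) if i % k != fold]
        for antecedent, label in mine_candidates(inner_train, min_support):
            for items, true_label in inner_test:
                if antecedent <= items:
                    covered[(antecedent, label)] += 1
                    correct[(antecedent, label)] += (true_label == label)
    # Rules that never cover an inner test record receive no score.
    return {rule: correct[rule] / covered[rule] for rule in covered}


# Toy data (invented for illustration): each record is (items, class label).
toy = [
    (frozenset({"fever", "cough"}), "flu"),
    (frozenset({"fever", "rash"}), "measles"),
    (frozenset({"cough"}), "flu"),
    (frozenset({"rash"}), "measles"),
    (frozenset({"fever", "cough"}), "flu"),
    (frozenset({"fever", "rash"}), "measles"),
]

scores = inner_cv_rule_scores(toy, k=3, min_support=2)
# Keep only rules whose held-out accuracy clears an (assumed) threshold.
kept = {rule for rule, acc in scores.items() if acc >= 0.8}
```

In this toy setting, a rule whose antecedent is the ambiguous item `fever` can pass the support threshold on an inner training set yet predict poorly on the inner test set, so it is filtered out, while rules built on `cough` or `rash` are kept. How the paper ranks, prunes, and combines rules after this step is not specified in the abstract and is not modeled here.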
