A classification rules mining method based on dynamic rules' frequency

Rule-based classification, or rule induction (RI), is a data mining approach that normally generates classifiers containing simple yet effective rules. Most RI algorithms suffer from a few drawbacks, mainly related to rule pruning and to rules sharing training data instances. In response to these two issues, a new dynamic rule induction (DRI) method is proposed that utilises two thresholds to minimise the items search space. Whenever a rule is generated, the DRI algorithm ensures that all candidate items' frequencies are updated to reflect the deletion of the rule's training data instances. The remaining candidate items waiting to be added to other rules therefore have dynamic rather than static frequencies. This enables DRI to generate not only rules with 100% accuracy but also rules with high, though imperfect, accuracy. Experiments on a number of UCI data sets have been conducted to compare DRI with a number of RI algorithms. The results show that DRI is competitive with the other RI algorithms with regard to classification accuracy and classifier size.
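The idea of dynamic item frequencies can be illustrated with a minimal Python sketch. It is not the authors' DRI implementation: the abstract does not name the two thresholds, so `min_freq` (minimum item frequency) and `min_strength` (minimum rule accuracy) are assumed here, and the function names, the greedy selection order and the toy data are all hypothetical.

```python
from collections import Counter

def grow_rule(data, min_freq, min_strength):
    """Greedily add (attribute, value) items until the rule is accurate enough.

    Hypothetical helper: min_freq and min_strength are assumed thresholds.
    """
    conditions, target, covered = {}, None, data
    while True:
        # Count item frequencies only over the instances still covered.
        totals, hits = Counter(), Counter()
        for row, label in covered:
            for attr, val in row.items():
                if attr not in conditions:
                    totals[(attr, val)] += 1
                    hits[(attr, val, label)] += 1
        candidates = [
            (n / totals[(a, v)], n, a, v, y)
            for (a, v, y), n in hits.items()
            if n >= min_freq and (target is None or y == target)
        ]
        if not candidates:
            return None  # no frequent item left; give up on this rule
        acc, _, attr, val, target = max(candidates, key=lambda c: (c[0], c[1]))
        conditions[attr] = val
        covered = [(r, y) for r, y in covered if r.get(attr) == val]
        if acc >= min_strength:
            return dict(conditions), target, acc

def induce_rules(rows, labels, min_freq=2, min_strength=0.8):
    """Covering loop: each new rule removes its instances before recounting."""
    remaining = list(zip(rows, labels))
    rules = []
    while remaining:
        rule = grow_rule(remaining, min_freq, min_strength)
        if rule is None:
            break
        rules.append(rule)
        conditions = rule[0]
        # Deleting covered instances is what makes later frequencies dynamic.
        remaining = [
            (r, y) for r, y in remaining
            if not all(r.get(a) == v for a, v in conditions.items())
        ]
    return rules

# Toy data, invented for illustration only.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
    {"outlook": "rainy", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "no"},
]
labels = ["yes", "yes", "yes", "no", "no", "yes", "no"]

rules = induce_rules(rows, labels)
for conditions, label, acc in rules:
    print(conditions, "->", label, f"(accuracy {acc:.2f})")
```

On this toy data the item `windy=yes` predicts `no` with accuracy 2/3 on the full training set, so a static count would reject it at `min_strength=0.8`. Once the instances covered by the first rule (`outlook=sunny`) are removed, the recount gives it accuracy 2/2 and it becomes the second rule.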