Bi-Level Associative Classifier Using Automatic Learning on Rules

The power of associative classifiers lies in discovering patterns in the data and classifying based on the features most indicative of the target class. Although they have emerged as competitive classification systems, associative classifiers suffer from limitations such as cumbersome support and confidence thresholds that require prior, dataset-specific knowledge. Furthermore, ranking the discovered rules during inference relies on arbitrary heuristics, such as taking the sum, average, minimum, or maximum of the confidences of the matching rules. In this study, we therefore propose a two-stage classification model that learns automatically both to discover rules and to weight them. In the first stage, statistically significant classification association rules are derived through association rule mining, using the p-value from Fisher's exact test to judge the statistical significance of each rule. In the second stage, the rules obtained from the first stage form meaningful features, and a machine learning classifier such as a neural network or SVM, or a rule-based learner such as RIPPER, automatically learns their weights for classification, instead of forcing the use of a specific heuristic. Our approach, BiLevCSS (Bi-Level Classification using Statistically Significant Rules), outperforms various state-of-the-art classifiers in terms of classification accuracy.
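
Below is a minimal sketch of the two-stage idea in Python, not the authors' implementation: Stage 1 filters candidate class-association rules with Fisher's exact test (via scipy.stats.fisher_exact), and Stage 2 encodes each instance as a binary vector of rule matches and trains an off-the-shelf classifier (here scikit-learn's SVC) on those features. The helper names mine_cars and contingency are hypothetical placeholders for a rule miner and a contingency-table builder.

from scipy.stats import fisher_exact
from sklearn.svm import SVC
import numpy as np

def rule_is_significant(n_match_class, n_match_other,
                        n_nomatch_class, n_nomatch_other, alpha=0.05):
    # Fisher's exact test on the rule's 2x2 contingency table:
    # rows = body matches / does not match; columns = rule's class / other classes.
    # The one-sided test checks for positive dependence between body and class.
    table = [[n_match_class, n_match_other],
             [n_nomatch_class, n_nomatch_other]]
    _, p_value = fisher_exact(table, alternative="greater")
    return p_value < alpha

def rule_features(instances, rules):
    # Binary feature matrix: entry (i, j) is 1 iff the body of rule j
    # is a subset of (i.e., covers) instance i.
    # `instances` is a list of item sets; each rule is (body_set, class_label).
    return np.array([[1 if body <= inst else 0 for body, _ in rules]
                     for inst in instances])

# Hypothetical end-to-end usage (mine_cars and contingency are assumed helpers):
# rules = [r for r in mine_cars(train) if rule_is_significant(*contingency(r, train))]
# clf = SVC().fit(rule_features(train_items, rules), train_labels)
# predictions = clf.predict(rule_features(test_items, rules))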
