Associative classification with a new condenseness measure

Associative classification (AC) is a branch of data mining that uses association rules (ARs) for classification. ARs are extracted from a database and must satisfy statistical criteria such as minimal support. In some practical applications, however, useful ARs may be found among infrequent but closely related itemsets that a high minimal support filters out. In this study, a new measure, named “condenseness,” is presented for evaluating whether infrequent ruleitems filtered out by minimal support can form strong ARs for classification. For an infrequent ruleitem, the condenseness is the average lift of all ARs that can be generated from the ruleitem. A ruleitem with high condenseness has closely related elements and can serve for AC even if it does not have high support. Based on the concept of condenseness, a new associative classifier, condensed association rules for classification (CARC), is developed and presented. CARC generates ARs using a modified Apriori algorithm and develops new strategies for rule inference. With the condenseness measure and these rule inference strategies, more useful ARs can be produced, improving the effectiveness of associative classification. Empirical evidence shows that CARC mitigates the problems caused by setting minimal support too high or too low and achieves better classification performance.
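The condenseness definition above can be illustrated with a minimal Python sketch. It is not the CARC implementation: the abstract does not spell out which ARs count as "generated from the ruleitem," so the sketch assumes every split of the ruleitem into a non-empty antecedent and a non-empty consequent, and it uses the standard lift, supp(A ∪ C) / (supp(A) · supp(C)). The function names `support` and `condenseness` and the toy `transactions` are hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(items <= t for t in transactions) / len(transactions)


def condenseness(ruleitem, transactions):
    """Average lift over all antecedent -> consequent splits of `ruleitem`.

    Assumption (not confirmed by the abstract): every split into a
    non-empty antecedent and a non-empty consequent counts as one AR.
    """
    items = frozenset(ruleitem)
    if len(items) < 2:
        raise ValueError("a ruleitem needs at least two elements")
    joint = support(items, transactions)
    lifts = []
    for k in range(1, len(items)):
        for antecedent in combinations(sorted(items), k):
            a = frozenset(antecedent)
            c = items - a
            denom = support(a, transactions) * support(c, transactions)
            # Treat a split with zero marginal support as lift 0.
            lifts.append(joint / denom if denom else 0.0)
    return sum(lifts) / len(lifts)


# Toy data (hypothetical): {a, b, y} is rare but its items always co-occur;
# {c, n} is frequent but only weakly associated.
transactions = [frozenset({"a", "b", "y"})] * 2 + [frozenset({"c", "n"})] * 8

rare = condenseness({"a", "b", "y"}, transactions)
frequent = condenseness({"c", "n"}, transactions)
print(rare, frequent)
```

In this toy data every subset of {a, b, y} has support 0.2, so each split has lift 0.2 / (0.2 · 0.2) = 5, while {c, n} has lift 0.8 / (0.8 · 0.8) = 1.25. A minimal support of, say, 0.3 would discard the first ruleitem even though its elements are far more closely related, which is the situation the measure is meant to catch.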
