CAR-NF: A classifier based on specific rules with high netconf

In this paper, we propose CAR-NF, an accurate classifier based on Class Association Rules (CARs). CAR-NF introduces a new strategy for computing CARs that uses Netconf as the measure of interest, which makes it possible to prune the CAR search space and build specific rules with high Netconf. Moreover, we state and prove a proposition that supports the use of a Netconf threshold of 0.5 for mining the CARs. Additionally, CAR-NF introduces a new way of ordering the set of CARs based on their rule sizes and Netconf values. The ordering strategy, together with the "Best K rules" satisfaction mechanism, allows CAR-NF to achieve better accuracy than the CBA, CMAR, CPAR, TFPC and HARMONY classifiers, the best CAR-based classifiers reported in the literature.
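To make the role of Netconf concrete, the following is a minimal Python sketch rather than the CAR-NF implementation. It assumes the commonly cited definition Netconf(X => c) = (sup(X, c) - sup(X) * sup(c)) / (sup(X) * (1 - sup(X))), which takes values in [-1, 1]. The function names, the toy data, the choice to keep rules whose Netconf is at least the threshold, and the ordering direction (larger, more specific rules first, then higher Netconf) are all illustrative assumptions; the abstract only states that rules are ordered by size and Netconf.

```python
def netconf(transactions, antecedent, label):
    """Netconf of the rule antecedent => label (assumed definition, see lead-in).

    transactions: list of (frozenset_of_items, class_label) pairs.
    """
    n = len(transactions)
    n_x = sum(1 for items, _ in transactions if antecedent <= items)
    n_y = sum(1 for _, c in transactions if c == label)
    n_xy = sum(1 for items, c in transactions
               if antecedent <= items and c == label)
    sup_x, sup_y, sup_xy = n_x / n, n_y / n, n_xy / n
    denom = sup_x * (1 - sup_x)
    if denom == 0:
        # Antecedent covers none or all transactions: treat as uninformative.
        return 0.0
    return (sup_xy - sup_x * sup_y) / denom


def select_and_order(rules, threshold=0.5):
    """Keep rules at or above the threshold, then order them.

    Assumed ordering: larger antecedents first, ties broken by higher Netconf.
    rules: list of (antecedent, label, netconf_value) tuples.
    """
    kept = [r for r in rules if r[2] >= threshold]
    return sorted(kept, key=lambda r: (-len(r[0]), -r[2]))


# Hypothetical toy dataset: four labelled transactions.
TRANSACTIONS = [
    (frozenset({"a", "b"}), "pos"),
    (frozenset({"a", "b"}), "pos"),
    (frozenset({"a"}), "neg"),
    (frozenset({"c"}), "neg"),
]

# Hypothetical candidate rules (antecedent, class label).
CANDIDATES = [
    (frozenset({"a"}), "pos"),
    (frozenset({"a", "b"}), "pos"),
    (frozenset({"a"}), "neg"),
    (frozenset({"c"}), "neg"),
]

RULES = [(x, y, netconf(TRANSACTIONS, x, y)) for x, y in CANDIDATES]
ORDERED = select_and_order(RULES)

for antecedent, label, value in ORDERED:
    print(sorted(antecedent), "=>", label, round(value, 3))
```

In this toy data the rule {a} => neg has negative Netconf and is discarded, while the more specific rule {a, b} => pos is ranked ahead of the single-item rules. A "Best K rules" classifier would then score each class using the first K ordered rules that cover a new transaction.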
