Analysis of Various Interestingness Measures in Class Association Rule Mining

Many measures have been developed to assess the interestingness of rules in data mining. Numerous studies have shown that the effectiveness of a measure depends on the problem at hand, and that different measures often produce different, even conflicting, results. Selecting an appropriate measure is therefore an important issue in data mining. This paper proposes a novel approach for selecting an appropriate measure for class association rule mining. The approach is applied to several problems, including benchmark and real-world datasets. The experimental results show that it is an effective tool for analyzing various measures and selecting suitable ones for a given problem, which leads to higher classification accuracy. Based on this study, the paper further proposes four properties of interestingness measures that should be considered in class association rule mining.
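To make the point about conflicting measures concrete, the following minimal sketch ranks two class association rules by confidence and by lift. It is an illustration only, not the approach proposed in the paper: the abstract does not say which measures are analyzed, so the choice of confidence and lift is an assumption, and the names `RuleCounts` and `rank`, as well as all counts, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleCounts:
    """Counts for a rule X -> C (hypothetical numbers, illustration only)."""
    name: str
    n_total: int       # number of records in the dataset
    n_antecedent: int  # records matching the antecedent X
    n_class: int       # records belonging to class C
    n_both: int        # records matching X and belonging to C


def support(r: RuleCounts) -> float:
    # Fraction of all records that satisfy both X and C.
    return r.n_both / r.n_total


def confidence(r: RuleCounts) -> float:
    # Estimated P(C | X).
    return r.n_both / r.n_antecedent


def lift(r: RuleCounts) -> float:
    # P(C | X) divided by the prior P(C); values above 1 indicate
    # that X raises the probability of C.
    return confidence(r) / (r.n_class / r.n_total)


def rank(rules, measure):
    # Rule names ordered from most to least interesting under `measure`.
    return [r.name for r in sorted(rules, key=measure, reverse=True)]


# Assumed dataset: 1000 records, a majority class (800) and a minority class (200).
rules = [
    RuleCounts("A", n_total=1000, n_antecedent=100, n_class=800, n_both=85),
    RuleCounts("B", n_total=1000, n_antecedent=100, n_class=200, n_both=60),
]

ranking_by_confidence = rank(rules, confidence)
ranking_by_lift = rank(rules, lift)

print("by confidence:", ranking_by_confidence)
print("by lift:      ", ranking_by_lift)
```

With these assumed counts, rule A has the higher confidence (0.85 against 0.60) because it predicts the majority class, while rule B has the higher lift (about 3.0 against about 1.06) because it improves far more on its class prior. The two measures therefore order the rules differently, which is the kind of disagreement that makes measure selection a practical concern.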
