Association Rule Interestingness Measures: Experimental and Theoretical Studies

It is a common problem that KDD processes may generate a large number of patterns, depending on the algorithm used and its parameters, which makes it impossible for an expert to assess them all. This is the case with the well-known Apriori algorithm. One way to cope with this volume of output is to use association rule interestingness measures. Since selecting interesting rules also requires choosing a suitable measure, we present a formal and an experimental study of 20 measures. The experimental studies, carried out on 10 data sets, lead to an experimental classification of the measures. This classification is compared with an analysis of the formal and meaningful properties of the measures. Finally, the properties are used in a multi-criteria decision analysis to select, among the available measures, the one or those that best take the user's needs into account. These approaches appear to be complementary and could help solve the problem of a user's choice of measure.
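To make the role of an interestingness measure concrete, here is a minimal Python sketch that scores rules A -> B from their contingency counts. It uses three textbook measures (support, confidence, lift) purely as illustrations; the abstract does not list the 20 measures studied, so this choice is an assumption. The names `RuleCounts` and `rank_rules` and the example counts are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class RuleCounts:
    """Contingency counts for a rule A -> B (hypothetical container)."""
    n: int     # total number of transactions
    n_a: int   # transactions containing the antecedent A
    n_b: int   # transactions containing the consequent B
    n_ab: int  # transactions containing both A and B


def support(c: RuleCounts) -> float:
    # Fraction of all transactions that contain both A and B.
    return c.n_ab / c.n


def confidence(c: RuleCounts) -> float:
    # Estimated P(B | A).
    return c.n_ab / c.n_a


def lift(c: RuleCounts) -> float:
    # Ratio of observed co-occurrence to that expected under independence.
    return (c.n_ab * c.n) / (c.n_a * c.n_b)


def rank_rules(rules: Dict[str, RuleCounts],
               measure: Callable[[RuleCounts], float]) -> List[str]:
    # Order rule names from most to least interesting under `measure`.
    return sorted(rules, key=lambda name: measure(rules[name]), reverse=True)


# Illustrative counts only, not taken from the study's data sets.
rules = {
    "r1": RuleCounts(n=1000, n_a=100, n_b=500, n_ab=80),
    "r2": RuleCounts(n=1000, n_a=400, n_b=900, n_ab=380),
}

by_confidence = rank_rules(rules, confidence)
by_lift = rank_rules(rules, lift)
```

With these counts, confidence ranks `r2` above `r1`, while lift ranks `r1` above `r2`, because the consequent of `r2` is already very frequent. Disagreements of this kind are why the choice of measure matters to the user.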
