Two Measures of Objective Novelty in Association Rule Mining

Association rule mining is well known to depend heavily on a support threshold and on one or more thresholds for intensity of implication. Among the measures of intensity, confidence is the most widely used; related alternatives such as lift, leverage, improvement, or all-confidence are sometimes employed, either separately or jointly with confidence. We remain within the support-and-confidence framework and study complementary notions that measure relative forms of objective novelty, or surprisingness, of each individual rule with respect to the other rules that hold in the same dataset. We measure novelty through the extent to which the confidence value is robust relative to the confidences of related (for instance, logically stronger) rules, rather than by considering the single rule at hand in absolute terms. We consider two variants of this idea and analyze their logical and algorithmic properties. Since this approach has the drawback of requiring further parameters, we also propose a framework in which the user sets a single parameter, with clear intuitive semantics, from which the corresponding thresholds for confidence and novelty are computed.
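
To make the quantities named above concrete, the following minimal Python sketch computes support, confidence, lift, and leverage using their standard definitions on a toy dataset. It also computes one possible relative measure: the ratio between a rule's confidence and the highest confidence among rules with the same consequent and a strictly smaller antecedent, which are logically stronger. That ratio, the function name `relative_confidence`, and the toy transactions are assumptions made for illustration; the abstract does not define the two novelty measures, and this sketch should not be read as either of them.

```python
from itertools import combinations

# Toy transactions (hypothetical data, for illustration only).
TRANSACTIONS = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]


def support(itemset, transactions=TRANSACTIONS):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions=TRANSACTIONS):
    """Standard confidence: supp(X ∪ Y) / supp(X); 0.0 if X never occurs."""
    s_ante = support(antecedent, transactions)
    if s_ante == 0:
        return 0.0
    return support(set(antecedent) | set(consequent), transactions) / s_ante


def lift(antecedent, consequent, transactions=TRANSACTIONS):
    """Standard lift: conf(X -> Y) / supp(Y); 0.0 if Y never occurs."""
    s_cons = support(consequent, transactions)
    if s_cons == 0:
        return 0.0
    return confidence(antecedent, consequent, transactions) / s_cons


def leverage(antecedent, consequent, transactions=TRANSACTIONS):
    """Standard leverage: supp(X ∪ Y) - supp(X) * supp(Y)."""
    joint = support(set(antecedent) | set(consequent), transactions)
    return joint - support(antecedent, transactions) * support(consequent, transactions)


def relative_confidence(antecedent, consequent, transactions=TRANSACTIONS):
    """ASSUMPTION (not the paper's definition): confidence of X -> Y divided
    by the best confidence among rules X' -> Y with X' a proper subset of X.
    Returns None when no comparison rule exists or all have confidence 0."""
    antecedent = sorted(antecedent)
    competitors = [
        confidence(subset, consequent, transactions)
        for size in range(len(antecedent))  # sizes 0 .. |X| - 1
        for subset in combinations(antecedent, size)
    ]
    best = max(competitors, default=0.0)
    if best == 0:
        return None
    return confidence(antecedent, consequent, transactions) / best


if __name__ == "__main__":
    rule = ({"a", "b"}, {"c"})
    print("confidence:", confidence(*rule))
    print("lift:", lift(*rule))
    print("leverage:", leverage(*rule))
    print("relative confidence:", relative_confidence(*rule))
```

Under this assumed ratio, a value at or below 1 means that some rule with a smaller antecedent already reaches the same confidence, so the larger rule contributes little that is new. A value well above 1 indicates that the additional antecedent items raise the confidence.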
