Interestingness of association rules in data mining: Issues relevant to e-commerce

The ubiquitous low-cost connectivity of the internet has changed the competitive business environment by dissolving traditional sources of competitive advantage such as size and location. On this level playing field, firms are forced to compete on the basis of knowledge. Data mining tools and techniques provide e-commerce applications with novel and significant knowledge that can be leveraged to gain competitive advantage. However, the automated nature of data mining algorithms may result in a glut of patterns, whose sheer number makes them hard to comprehend. The importance of automated methods that address this immensity problem, particularly with respect to the practical application of data mining results, cannot be overstated. We first examine different approaches to this problem, citing their applicability to e-commerce where appropriate. We then provide a detailed survey of one important approach, namely interestingness measures, and discuss its relevance to e-commerce applications such as personalization in recommender systems. A study of the current literature brings out important issues that reveal many promising avenues for future research. We conclude by reiterating the importance of post-processing methods in data mining for the effective and efficient deployment of e-commerce solutions.
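To make the notion of an interestingness measure concrete, the following minimal sketch computes three widely used objective measures (support, confidence and lift) for a single association rule over a toy set of market baskets. It is an illustration only, not a method taken from the survey: the function name `rule_measures`, the basket contents and the rule `{bread} -> {milk}` are all assumptions made for the example.

```python
def rule_measures(transactions, antecedent, consequent):
    """Return support, confidence and lift for the rule antecedent -> consequent.

    `transactions` is a list of sets of items (hypothetical toy data).
    """
    a = frozenset(antecedent)
    c = frozenset(consequent)
    n = len(transactions)

    # Count transactions containing the antecedent, the consequent, and both.
    count_a = sum(1 for t in transactions if a <= t)
    count_c = sum(1 for t in transactions if c <= t)
    count_ac = sum(1 for t in transactions if (a | c) <= t)

    support = count_ac / n if n else 0.0
    confidence = count_ac / count_a if count_a else 0.0
    # Lift compares the rule's confidence with the baseline frequency of the
    # consequent; values near 1 suggest the two sides are roughly independent.
    lift = confidence / (count_c / n) if count_c else 0.0
    return {"support": support, "confidence": confidence, "lift": lift}


# Illustrative baskets (assumed data, not from the paper).
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

measures = rule_measures(baskets, {"bread"}, {"milk"})
for name, value in measures.items():
    print(f"{name}: {value:.4f}")
```

In this toy data the rule has fairly high confidence yet a lift slightly below 1, which hints at why a single threshold on support or confidence may not be enough to filter a large rule set down to the patterns worth showing to a user.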
