PARAS: A Parameter Space Framework for Online Association Mining

Association rule mining is known to be computationally intensive, yet real-time decision-making applications are increasingly intolerant of delays. In this paper, we introduce PARAS, a parameter space model that enables efficient rule mining by compactly maintaining the final rulesets. The PARAS model is based on the notion of stable region abstractions, which form a coarse-granularity ruleset space. Based on new insights into the redundancy relationships among rules, PARAS establishes a surprisingly compact representation of complex redundancy relationships while enabling efficient redundancy resolution at query time. Besides classical rule mining requests, the PARAS model supports three novel classes of exploratory queries. Using the proposed PSpace index, these exploratory query classes can all be answered with near real-time responsiveness. Our experimental evaluation on several benchmark datasets demonstrates that PARAS achieves an improvement of 2 to 5 orders of magnitude over state-of-the-art approaches to online association rule mining.
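To make the idea of a stable region concrete, the following is a minimal sketch and not the paper's PSpace index. It assumes the parameter space is the (minimum support, minimum confidence) plane and that every rule's support and confidence have been precomputed offline; the rule data and all function names are hypothetical. Under those assumptions, the distinct support and confidence values cut the plane into grid cells, and every query point inside one cell returns the same ruleset, so an online request reduces to a cell lookup. Redundancy resolution is not modeled here.

```python
from bisect import bisect_left

# Hypothetical precomputed rules: (rule label, support, confidence).
RULES = [
    ("a -> b", 0.40, 0.80),
    ("b -> c", 0.25, 0.60),
    ("a -> c", 0.25, 0.90),
    ("c -> d", 0.10, 0.50),
]


def build_index(rules):
    """Precompute the ruleset of every grid cell (assumed offline phase)."""
    # Distinct parameter values are the cut points of the grid.
    supports = sorted({s for _, s, _ in rules})
    confidences = sorted({c for _, _, c in rules})
    index = {}
    for i in range(len(supports) + 1):
        for j in range(len(confidences) + 1):
            # Cell (i, j) holds the rules whose support is at position >= i
            # and whose confidence is at position >= j on the sorted axes.
            index[(i, j)] = frozenset(
                name
                for name, s, c in rules
                if bisect_left(supports, s) >= i
                and bisect_left(confidences, c) >= j
            )
    return supports, confidences, index


def query(supports, confidences, index, min_supp, min_conf):
    """Answer a mining request by locating the cell containing the query point."""
    cell = (bisect_left(supports, min_supp), bisect_left(confidences, min_conf))
    return index[cell]


def brute_force(rules, min_supp, min_conf):
    """Reference answer: filter every rule against both thresholds."""
    return frozenset(
        name for name, s, c in rules if s >= min_supp and c >= min_conf
    )


SUPPORTS, CONFIDENCES, INDEX = build_index(RULES)
```

In this sketch, the requests (0.20, 0.55) and (0.25, 0.60) fall in the same cell and therefore share one stored ruleset. The grid is deliberately naive: it stores one ruleset per cell, whereas the abstract describes a more compact representation.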
