Implementing BDFS(b) with diff-sets for real-time frequent pattern mining in dense datasets - first findings

Finding frequent patterns in databases has been the most researched topic in association-rule mining. Business intelligence based on data mining has seen growing demand for real-time frequent pattern mining algorithms from numerous real-time business applications such as e-commerce, recommender systems, group decision support systems, and supply chain management. The last decade has seen the development of a great many algorithms, among which vertical mining algorithms have been found to be very effective. However, on dense datasets the performance of these algorithms degrades significantly. Moreover, these algorithms are not suited to real-time requirements. In this paper, we describe BDFS(b)-diff-sets, an algorithm that performs real-time frequent pattern mining using diff-sets and an intelligent staged search technique, bypassing the usual breadth-first and depth-first search techniques. Empirical evaluations show that our algorithm can make a fair estimate of the probable frequent patterns within a user-defined time bound and reaches some of the longest frequent patterns much faster than existing algorithms.
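To make the diff-set idea concrete, the following is a minimal Python sketch of the standard diffset recurrence from the vertical-mining literature: the diffset of an itemset records the transactions that contain its prefix but not the itemset itself, so support is obtained by subtraction instead of by intersecting full transaction-id lists. The function names `item_diffsets` and `extend`, and the toy transactions, are illustrative assumptions. The sketch does not reproduce the BDFS(b) staged search or its time-bound handling, which the abstract does not specify.

```python
def item_diffsets(transactions):
    """Map each single item to (diffset, support) relative to the empty prefix.

    The diffset of an item is the set of transaction ids that do NOT contain it.
    """
    all_tids = set(range(len(transactions)))
    items = sorted({item for t in transactions for item in t})
    result = {}
    for item in items:
        tidset = {tid for tid, t in enumerate(transactions) if item in t}
        diffset = all_tids - tidset
        result[item] = (diffset, len(all_tids) - len(diffset))
    return result


def extend(px, py):
    """Combine PX and PY, which share the prefix P, into PXY.

    Recurrence: d(PXY) = d(PY) - d(PX) and support(PXY) = support(PX) - |d(PXY)|.
    """
    d_px, sup_px = px
    d_py, _ = py
    d_pxy = d_py - d_px
    return d_pxy, sup_px - len(d_pxy)


# Hypothetical toy database: each transaction is a set of items.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"a", "b", "c"},
    {"b", "c"},
]

singles = item_diffsets(transactions)
ab = extend(singles["a"], singles["b"])   # itemset {a, b}
ac = extend(singles["a"], singles["c"])   # itemset {a, c}
abc = extend(ab, ac)                      # itemset {a, b, c}, shared prefix "a"
```

In dense datasets most items occur in most transactions, so these diffsets tend to stay much smaller than the corresponding transaction-id sets, which is the usual motivation for the representation.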
