Efficient Tree Based Distributed Data Mining Algorithms for mining Frequent Patterns

Advances in wired and wireless networking have paved the way for many dynamic distributed computing environments. These environments have diverse computing resources and multiple heterogeneous data sources. Most mining algorithms are designed to mine rules from monolithic, non-distributed databases. Even algorithms designed specifically for distributed databases normally download the relevant data to a centralized location and then perform the data mining operations there. This centralized approach does not work well in many distributed, ubiquitous, privacy-sensitive data mining applications, which has opened a new area of research within the data mining domain: Distributed Data Mining (DDM). Among the various methods used to mine frequent itemsets, the tree-based methodology shows some efficiency in distributed environments. In this paper we therefore study a set of tree-based algorithms [DTFIM, PP, LFP and PP] for mining frequent patterns in distributed environments.
