Mining Frequent Item Sets Data Streams using "Éclat Algorithm"

Frequent pattern mining is the process of discovering item sets or other patterns that occur frequently in a large database. The resulting frequent item sets are those that meet the minimum support threshold. Association rule mining aims to find association rules that satisfy a predefined minimum support and confidence in a given database. An item set is said to be frequent if its support meets the minimum support threshold; it does not have to appear in every transaction of the database, only in at least the required fraction of them. Discovering frequent item sets plays a very important role in mining association rules, sequence rules, web logs and many other interesting patterns in complex data. A data stream is a real-time, continuous, ordered sequence of items, an uninterrupted flow of a long sequence of data. Real-world examples of data streams include sensor network data, telecommunication data, transactional data and scientific surveillance systems. These sources produce trillions of updates every day, so it is very difficult to store the entire data, and patterns have to be mined as the data arrives. Data mining is the non-trivial process of identifying valid, novel, potentially useful and ultimately understandable patterns in data; it extracts hidden predictive information from large databases. Many algorithms have been proposed to find frequent item sets. The Apriori algorithm is the first classical algorithm for this task. Many later algorithms are similar to Apriori in that they are based on candidate generation and pruning, which costs considerable memory and time. In this paper, we study how the Eclat algorithm can be used on data streams to find frequent item sets. Eclat does not require the Apriori-style candidate generation step.
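To make the idea concrete, the following is a minimal sketch of Eclat-style mining, not the implementation evaluated in the paper. It assumes the stream has already been cut into a finite batch (for example, one window of transactions), that support is given as an absolute transaction count, and that items are sortable values such as strings. The names `eclat`, `transactions` and `min_support` are illustrative only. The sketch stores, for each item, the set of transaction ids that contain it (a vertical layout) and obtains the support of a larger item set by intersecting those id sets instead of rescanning the data.

```python
from collections import defaultdict


def eclat(transactions, min_support):
    """Return {frozenset(itemset): support_count} for every frequent item set.

    Assumption: min_support is an absolute count of transactions.
    """
    # Vertical layout: item -> set of ids of the transactions containing it.
    tidsets = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets[item].add(tid)

    frequent = {}

    def extend(prefix, candidates):
        # Each candidate is (item, tidset); the tidset is already restricted
        # to transactions that contain every item in the prefix.
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            frequent[frozenset(itemset)] = len(tids)
            # Intersect with the items that come later in the ordering.
            suffix = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids
                if len(shared) >= min_support:
                    suffix.append((other, shared))
            if suffix:
                extend(itemset, suffix)

    start = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    extend(frozenset(), start)
    return frequent


# Hypothetical batch of five transactions taken from a stream window.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
result = eclat(transactions, min_support=3)
```

With a threshold of 3, each single item and each pair is frequent, while {a, b, c} appears in only two transactions and is discarded. This also illustrates that a frequent item set need not occur in every transaction. For a real stream, one possible approach is to rerun or incrementally update such a computation per window; the sketch leaves that choice open.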
