Frequent Pattern Mining in Data Streams

As the volume of digital commerce and communication has exploded, the demand for mining streaming data has grown with it. One of the fundamental data mining tasks, for both static and streaming data, is frequent pattern mining. The goal of pattern mining is to identify frequently occurring patterns and structures. Such patterns may indicate scientific phenomena, economic or social trends, or even security threats. Pattern discovery is important in its own right, and it is also a building block for machine learning tasks such as association rule induction. Traditionally, algorithms for pattern discovery have processed the entire dataset as a batch, with no restriction on the number of passes made over the data.
