DWFIST: Leveraging Calendar-Based Pattern Mining in Data Streams

Calendar-based pattern mining aims to identify patterns that hold on specific calendar partitions, for example every Monday, the first working day of each month, or every holiday. Providing flexible mining capabilities for such partitions is especially challenging in a data stream scenario: the calendar partitions of interest are not known a priori, and at any point in time only a subset of the detailed data is available. We show how a data warehouse approach can be applied to this problem. The data warehouse keeps track of the frequent itemsets that hold on different partitions of the original stream and has low storage requirements. Nevertheless, it allows sets of patterns to be derived that are complete and precise. A series of experiments demonstrates the effectiveness of our approach.
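
To make the idea concrete, the following is a minimal sketch and not the DWFIST implementation. It assumes the stream is partitioned by day, that only the frequent itemsets of each day are stored together with the day's transaction count, and that a calendar partition is given as a predicate over dates. The names `frequent_itemsets`, `build_warehouse`, and `derive` are hypothetical, as is the brute-force counting of itemsets up to size 2. Summing stored counts yields only a lower bound on the support in the calendar partition, so this simplified version reports itemsets that are guaranteed frequent; it does not reproduce the completeness and precision guarantees claimed above.

```python
from collections import Counter
from datetime import date
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force count of itemsets up to max_size; keep those meeting min_support."""
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            counts.update(combinations(items, k))
    threshold = min_support * len(transactions)
    return {s: c for s, c in counts.items() if c >= threshold}


def build_warehouse(stream_by_day, min_support):
    """Assumed layout: day -> (number of transactions, frequent itemsets with counts)."""
    return {
        day: (len(txs), frequent_itemsets(txs, min_support))
        for day, txs in stream_by_day.items()
    }


def derive(warehouse, in_partition, min_support):
    """Combine stored per-day counts for the days selected by in_partition.

    Counts from days where an itemset was not frequent were never stored,
    so each sum is a lower bound on the true count in the calendar partition.
    """
    total = 0
    lower = Counter()
    for day, (n, itemsets) in warehouse.items():
        if in_partition(day):
            total += n
            lower.update(itemsets)
    return {s: c for s, c in lower.items() if c >= min_support * total}


# Hypothetical toy stream: two Mondays and one Tuesday.
stream = {
    date(2024, 1, 1): [["a", "b"], ["a", "b"], ["a", "c"], ["b"]],
    date(2024, 1, 2): [["c"], ["c", "d"], ["a"]],
    date(2024, 1, 8): [["a", "b"], ["a"], ["a", "b"], ["d"]],
}
warehouse = build_warehouse(stream, min_support=0.5)
mondays = derive(warehouse, lambda d: d.weekday() == 0, min_support=0.5)
```

The calendar partition ("every Monday") is chosen only at query time, after the detailed transactions have been discarded, which is the situation the abstract describes.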
