Detecting Changes in Rare Patterns from Data Streams

Current drift detection techniques for data streams focus on finding changes in labeled data intended for supervised machine learning methods. Up to now, no research has considered drift detection on item-based data streams with unlabeled data intended for unsupervised association rule mining. In this paper, we discuss the current issues in performing drift detection of rare patterns in data streams and present a working approach that enables the detection of rare pattern changes. We propose a novel measure, called the M measure, that facilitates pattern change detection, and our experiments show that this measure can be used to detect changes in rare patterns in data streams efficiently and accurately.
