Mining State Dependencies Between Multiple Sensor Data Sources

Pattern mining over data streams is critical to a variety of applications, such as predicting and tracking the evolution of weather phenomena or detecting anomalies in security applications. Most current techniques attempt to discover associations between events appearing on the same data stream but cannot discover associations over multiple heterogeneous data streams. In this work, we aim to identify temporal dependencies between data streams. We represent event streams by state streams that are induced by the streams’ events themselves. Each state has a duration, represented as a set of disjoint time intervals with respect to the events that occurred in the stream. Temporal relations between these interval sets infer dependencies between the corresponding data sources. Our interval-based approach is robust to the temporal variability of events that characterizes the time intervals during which the events are related. It links two types of events if the occurrence of one is often followed by the appearance of the other within a certain time interval. The proposed approach determines the most appropriate time intervals of a temporal dependency, whose validity is assessed by a χ² test. As several intervals may redundantly describe the same dependency, the approach retrieves only the few most specific intervals with respect to a dominance relationship over temporal dependencies, and thus avoids the classical problem of pattern flooding in data mining. The TEDDY algorithm (TEmporal Dependency DiscoverY) prunes the search space while certifying the discovery of all valid and significant temporal dependencies. We present empirical results on simulated data to show the scalability and the robustness
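
To make the idea of a lag-interval dependency validated by a χ² test concrete, the following is a minimal Python sketch. It is not the TEDDY algorithm: it checks a single candidate lag interval `[lo, hi]` between two point-event streams, and it assumes a simple independence baseline (the fraction of the observation horizon from which a B event would be reachable by chance). All function names, the horizon parameter, and the 0.05 significance level are assumptions made for illustration only.

```python
from bisect import bisect_left

# Chi-squared critical value for 1 degree of freedom at the 0.05 level.
CHI2_CRITICAL_1DF_05 = 3.841


def followed_count(a_times, b_times, lo, hi):
    """Count A events at time t with at least one B event in [t + lo, t + hi]."""
    b_sorted = sorted(b_times)
    count = 0
    for t in a_times:
        i = bisect_left(b_sorted, t + lo)  # first B event at or after t + lo
        if i < len(b_sorted) and b_sorted[i] <= t + hi:
            count += 1
    return count


def coverage_probability(b_times, lo, hi, horizon):
    """Fraction of [0, horizon] from which some B event lies in [t + lo, t + hi].

    An instant t qualifies iff t is in [b - hi, b - lo] for some B event b,
    so we measure the union of these windows (clipped to the horizon).
    """
    windows = sorted((max(0.0, b - hi), min(horizon, b - lo)) for b in b_times)
    covered = 0.0
    cur_start = cur_end = None
    for start, end in windows:
        if end <= start:
            continue  # window falls entirely outside the horizon
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)  # merge overlapping windows
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / horizon


def chi2_statistic(observed, n, p):
    """Goodness-of-fit statistic for `observed` hits out of n, expected rate p (0 < p < 1)."""
    expected_hit = n * p
    expected_miss = n * (1 - p)
    return ((observed - expected_hit) ** 2 / expected_hit
            + ((n - observed) - expected_miss) ** 2 / expected_miss)


# Hypothetical toy streams: each A event is followed by a B event 1 to 3 time units later.
a_times = [10, 30, 50, 70, 90]
b_times = [12, 33, 51, 72, 93]
hits = followed_count(a_times, b_times, lo=1, hi=4)
p = coverage_probability(b_times, lo=1, hi=4, horizon=100)
stat = chi2_statistic(hits, len(a_times), p)
print(hits, p, stat > CHI2_CRITICAL_1DF_05)
```

With so few events the χ² approximation is unreliable, so the toy data only shows the mechanics. A full method along the lines described above would additionally search over candidate intervals and keep only the most specific significant ones.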
