Monitoring Distributed Data Streams through Node Clustering

Monitoring data streams in a distributed system is a challenging problem with many important applications. Feature selection (e.g., by monitoring the information gain of various features) is one such application: when performed with straightforward centralized algorithms it incurs a very high communication overhead, so special techniques are required to avoid it.
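To make the communication cost concrete, the following is a minimal sketch of the straightforward centralized baseline that the abstract contrasts against, not the method proposed in the paper. It assumes each node keeps a table of counts keyed by (feature value, class label) and ships the whole table to a coordinator, which merges the tables and computes the information gain. The function names and the count-table layout are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy in bits of a distribution given as raw counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(table):
    """Information gain of one feature.

    `table` maps (feature_value, label) pairs to occurrence counts.
    """
    label_counts = Counter()
    feature_counts = Counter()
    for (value, label), count in table.items():
        label_counts[label] += count
        feature_counts[value] += count

    total = sum(table.values())
    # Conditional entropy H(label | feature), weighted by feature frequency.
    conditional = 0.0
    for value, value_count in feature_counts.items():
        per_label = [c for (v, _), c in table.items() if v == value]
        conditional += (value_count / total) * entropy(per_label)

    return entropy(label_counts.values()) - conditional


def centralized_gain(node_tables):
    """Naive baseline (assumed): every node sends its full count table."""
    merged = Counter()
    for node_table in node_tables:
        merged.update(node_table)  # adds counts key by key
    return information_gain(merged)


# Two hypothetical nodes observing the same binary feature.
node_a = Counter({(1, "spam"): 4, (0, "ham"): 1})
node_b = Counter({(0, "ham"): 3})
gain = centralized_gain([node_a, node_b])
```

In this baseline every update at every node changes a table that must be sent again for the coordinator's value to stay current, which is the overhead that motivates more communication-efficient monitoring.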
