Declarative in-network sensor data analysis

Bridging data analysis techniques with classic query processing has long been of interest in the database community. Most approaches, however, are developed with a specific domain in mind (e.g., relational or streaming data), use their own query language, or focus on specific techniques. In this paper, we propose a simple yet effective extension to standard or commonly used declarative processing languages to support data mining. Our approach is independent of any particular domain and relies on a query refactoring technique, so optimization is handled by the underlying query processing engine, which is already in place and best knows the particularities of the setting. Our approach therefore promotes ease of programming, development, and use of data mining techniques, with minimal modifications to the query processing stack. We demonstrate our technique through an experimental evaluation using our prototype system, SNEE-A, which runs in-network data analysis on a sensor network deployment, a setting with several critical constraints.
