Probabilistic Stream Relational Algebra: A Data Model for Sensor Data Streams

Abstract: Sensor data streams exhibit special characteristics such as inherent information uncertainty and inherent data sample correlations, both within and across streams. We introduce a new data model, called Probabilistic Stream Relational Algebra (PSRA), that models a sensor data stream as a set of probabilistic data samples, along with a prediction strategy for each attribute that captures domain knowledge of inherent data correlations. We also explicitly associate every operation with a schedule, specifying when the next data sample should be produced, to facilitate resource management in sensor networks. We prove that operators in PSRA are non-blocking, which makes PSRA especially suitable for data stream processing. We also show that the conventional relational model and existing deterministic data stream processing models can be modeled in PSRA.
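
The three ingredients named in the abstract (probabilistic samples, a per-attribute prediction strategy, and a schedule) can be pictured with a small sketch. This is only an illustration under stated assumptions, not the paper's formal definition: the names `ProbSample`, `Stream`, `hold_last`, and `period`, the Gaussian-style mean/variance representation, and the variance growth rate of 0.1 per time unit are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class ProbSample:
    """One uncertain reading of a single attribute (assumed mean/variance form)."""
    timestamp: float
    mean: float
    variance: float


# A prediction strategy maps (past samples, query time) to an estimated sample.
Predictor = Callable[[List[ProbSample], float], ProbSample]


def hold_last(history: List[ProbSample], t: float) -> ProbSample:
    """Hypothetical strategy: keep the last mean, widen variance with elapsed time."""
    last = history[-1]
    elapsed = t - last.timestamp
    return ProbSample(t, last.mean, last.variance + 0.1 * elapsed)


@dataclass
class Stream:
    samples: List[ProbSample]   # observed samples, ordered by timestamp
    predict: Predictor          # prediction strategy for this attribute
    period: float               # schedule: time between produced samples

    def next_due(self) -> float:
        """Time at which the schedule calls for the next sample."""
        return self.samples[-1].timestamp + self.period

    def value_at(self, t: float) -> ProbSample:
        """Return the observed sample at t, or a prediction from earlier samples."""
        for s in self.samples:
            if s.timestamp == t:
                return s
        earlier = [s for s in self.samples if s.timestamp < t]
        return self.predict(earlier, t)


temperature = Stream([ProbSample(0.0, 20.0, 0.5)], hold_last, period=10.0)
estimate = temperature.value_at(5.0)
```

Because `value_at` never waits for a future sample, a query between two scheduled readings still gets an answer, with the added uncertainty made explicit in the variance. That is the intuition behind pairing samples with prediction strategies, though the paper's actual operators and proofs are not reproduced here.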
