Approximation trade-offs in Markovian stream processing: An empirical study

A large amount of the world's data is both sequential and imprecise. Such data is commonly modeled as Markovian streams; examples include words or sentences inferred from raw audio signals, and discrete location sequences inferred from RFID or GPS data. The rich semantics and large volumes of these streams make them difficult to query efficiently. In this paper, we study the effects, on both efficiency and accuracy, of two common stream approximations. Through experiments on a real-world RFID data set, we identify conditions under which these approximations can improve performance by several orders of magnitude with only minimal effects on query results. We also identify cases in which the full rich semantics are necessary.
