Querying streams using regular expressions: some semantics, decidability, and efficiency issues

This paper analyzes the decidability and complexity problems that arise when matching regular expressions against infinite streams of sets of symbols. We show that, in important application domains, several apparently obvious semantics either detect spurious events (events that are mere artifacts of the semantics) or miss events of potential interest. We single out a class of semantics, of interest in many applications, which we dub use-and-throw: in a use-and-throw semantics, an elementary event can participate in the creation of at most one detected complex event. Many areas of research have identified this as a desirable requirement (we give the examples of databases and video surveillance), but so far there has been no systematic study of the characteristics of these semantics, in particular their decidability and algorithmic complexity. This paper provides some initial answers on this subject. We analyze several semantics, provide polynomial-time algorithms for them, and prove their correctness and their properties.
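To make the use-and-throw requirement concrete, the following is a minimal illustrative sketch and not one of the algorithms of the paper. It assumes a finite prefix of the stream, a pattern that is a plain sequence of symbols matched at consecutive time steps (rather than a full regular expression), and an earliest-first policy for choosing among overlapping matches. The names `naive_matches` and `use_and_throw_matches` are hypothetical.

```python
from typing import List, Set, Tuple

# An elementary event is a symbol observed at a given time step.
Event = Tuple[int, str]


def naive_matches(stream: List[Set[str]], pattern: List[str]) -> List[List[Event]]:
    """Report every window of consecutive time steps that contains the pattern."""
    k = len(pattern)
    return [
        [(i + j, pattern[j]) for j in range(k)]
        for i in range(len(stream) - k + 1)
        if all(pattern[j] in stream[i + j] for j in range(k))
    ]


def use_and_throw_matches(stream: List[Set[str]], pattern: List[str]) -> List[List[Event]]:
    """Keep a match only if none of its elementary events was already consumed.

    Assumption: matches are considered in order of start time (earliest first).
    """
    used: Set[Event] = set()
    detected: List[List[Event]] = []
    for match in naive_matches(stream, pattern):
        if used.isdisjoint(match):
            detected.append(match)
            used.update(match)
    return detected


# Four time steps; each element is the set of symbols observed at that step.
stream = [{"a"}, {"a"}, {"a"}, {"a", "b"}]
pattern = ["a", "a"]

all_found = naive_matches(stream, pattern)
disjoint_found = use_and_throw_matches(stream, pattern)
print(len(all_found), len(disjoint_found))
```

In this sketch the naive semantics reports three overlapping matches, while the use-and-throw variant keeps only those that share no elementary event, so the event `(1, "a")` contributes to a single detected complex event.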
