Extensions of Attribute Grammars for Structured Document Queries

Document specification languages like XML model documents using extended context-free grammars. These differ from standard context-free grammars in that they allow arbitrary regular expressions on the right-hand side of productions. To query such documents, we introduce a new form of attribute grammars (extended AGs) that work directly over extended context-free grammars rather than over standard context-free grammars. Viewed as a query language, extended AGs are particularly relevant because they can take into account the inherent order of the children of a node in a document. We show that two key properties of standard attribute grammars carry over to extended AGs: efficiency of evaluation and decidability of well-definedness. We further characterize the expressiveness of extended AGs in terms of monadic second-order logic and show that their non-emptiness and equivalence problems are EXPTIME-complete. As an application, we show that Region Algebra expressions can be efficiently translated into extended AGs. This translation drastically improves the known upper bound on the complexity of the emptiness and equivalence tests for Region Algebra expressions.
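
To make the notion of an extended context-free grammar concrete, here is a minimal Python sketch, not taken from the paper. It assumes a hypothetical grammar for a toy "book" document type in which every production maps an element label to a regular expression over the ordered sequence of child labels. The names Node, GRAMMAR, and conforms are illustrative assumptions. The encoding of the child sequence as a space-terminated string is likewise just a convenience for reusing Python's re module.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    """A document node: a label plus an ordered list of children."""
    label: str
    children: list["Node"] = field(default_factory=list)


# Hypothetical extended CFG (an assumption for illustration only).
# Each right-hand side is a regular expression over child labels,
# where every label in the child sequence is followed by one space.
GRAMMAR = {
    "book": r"title (chapter )+",   # a title, then one or more chapters
    "chapter": r"title (para )*",   # a title, then any number of paragraphs
    "title": r"",                   # leaf: no children allowed
    "para": r"",                    # leaf: no children allowed
}


def conforms(node: Node, grammar: dict[str, str]) -> bool:
    """Check that the subtree rooted at node is a derivation tree of grammar."""
    pattern = grammar.get(node.label)
    if pattern is None:
        return False  # label has no production
    # Encode the ordered child labels as a single string, e.g. "title para ".
    child_word = "".join(child.label + " " for child in node.children)
    if re.fullmatch(pattern, child_word) is None:
        return False  # child sequence is not in the production's language
    return all(conforms(child, grammar) for child in node.children)


doc = Node("book", [
    Node("title"),
    Node("chapter", [Node("title"), Node("para"), Node("para")]),
    Node("chapter", [Node("title")]),
])
print(conforms(doc, GRAMMAR))
```

Because each production constrains the whole ordered sequence of children, a node may have arbitrarily many children and their order matters. This is the setting in which the extended AGs described above operate.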
