Marrying words and trees

Traditionally, data that has both linear and hierarchical structure, such as annotated linguistic data, is modeled using ordered trees and queried using tree automata. In this paper, we argue that nested words and automata over nested words offer a better way to capture and process the dual structure. Nested words generalize both words and ordered trees, and allow both word and tree operations. We study various classes of automata over nested words, and show that while they enjoy expressiveness and succinctness benefits over word and tree automata, their analysis complexity and closure properties are analogous to the corresponding word and tree special cases. In particular, we show that finite-state nested word automata can be exponentially more succinct than tree automata, and pushdown nested word automata include the two incomparable classes of context-free word languages and context-free tree languages.
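To make the dual structure concrete, the following is a minimal sketch, not the paper's formal construction. It assumes a nested word is represented as a list of (kind, symbol) pairs, where kind is call, internal, or return. It encodes an ordered tree as such a word and runs a simple deterministic automaton that passes a hierarchical state from each call to its matching return. All names (`tree_to_nested_word`, `run_nwa`, the `BOTTOM` marker, the example transition functions) are hypothetical and chosen for illustration only.

```python
CALL, INTERNAL, RETURN = "call", "internal", "return"


def tree_to_nested_word(tree):
    """Encode an ordered tree (label, [children]) as a list of (kind, symbol) pairs."""
    label, children = tree
    word = [(CALL, label)]
    for child in children:
        word.extend(tree_to_nested_word(child))
    word.append((RETURN, label))
    return word


def run_nwa(word, delta_call, delta_internal, delta_return,
            start, accepting, bottom="BOTTOM"):
    """Run a deterministic automaton over a nested word.

    A call emits a hierarchical state that is consumed by its matching return.
    A pending return (one with no matching call) sees the `bottom` marker.
    The explicit stack is only a way to find matching positions in this sketch.
    """
    state, pending = start, []
    for kind, symbol in word:
        if kind == CALL:
            state, hier = delta_call(state, symbol)
            pending.append(hier)
        elif kind == INTERNAL:
            state = delta_internal(state, symbol)
        else:  # RETURN
            hier = pending.pop() if pending else bottom
            state = delta_return(state, hier, symbol)
    return state in accepting


# Example automaton (an assumption for illustration): accept nested words in
# which every return carries the same symbol as its matching call.
def delta_call(state, symbol):
    return state, symbol  # remember the call symbol along the nesting edge


def delta_internal(state, symbol):
    return state


def delta_return(state, hier, symbol):
    return state if hier == symbol else "bad"


tree = ("a", [("b", []), ("c", [])])
word = tree_to_nested_word(tree)
print(run_nwa(word, delta_call, delta_internal, delta_return, "ok", {"ok"}))
```

A plain word is the special case with only internal positions, and a tree is the special case produced by `tree_to_nested_word`; the same runner handles both, as well as inputs with unmatched calls or returns that have no tree encoding.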
