Retrieval from hierarchical texts by partial patterns

Structured texts (for example, dictionaries and user manuals) typically have a hierarchical (tree-like) structure. We describe a query language for retrieving information from collections of hierarchical text. The language is based on a tree pattern matching notion called tree inclusion. Tree inclusion allows easy expression of queries that use both the structure and the content of the document. A user of the language need not be aware of the whole structure of the database. A language based on tree inclusion is thus data independent, a property that the great variance in the structure of the texts makes necessary.
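
To make the notion concrete, here is a minimal sketch of ordered tree inclusion, assuming the common definition in which a pattern is included in a target tree if it can be obtained from the target by deleting nodes (the children of a deleted node are attached to its parent). The tuple representation of nodes, the function names, and the dictionary-entry example are illustrative assumptions, not the paper's query language or algorithm. The recursion is a plain brute-force search and is not meant to be efficient.

```python
from typing import List, Tuple

# Hypothetical representation: a node is (label, [children]).
Tree = Tuple[str, list]


def forest_included(pattern: List[Tree], target: List[Tree]) -> bool:
    """Return True if the ordered forest `pattern` can be obtained from
    the ordered forest `target` by deleting nodes only."""
    if not pattern:
        return True   # nothing left to match
    if not target:
        return False  # pattern nodes remain but no target nodes do
    p_label, p_children = pattern[-1]
    t_label, t_children = target[-1]
    # Option 1: match the rightmost pattern root with the rightmost target root.
    if (p_label == t_label
            and forest_included(p_children, t_children)
            and forest_included(pattern[:-1], target[:-1])):
        return True
    # Option 2: delete the rightmost target root; its children take its place.
    return forest_included(pattern, target[:-1] + t_children)


def tree_included(pattern: Tree, target: Tree) -> bool:
    """Ordered tree inclusion for single trees."""
    return forest_included([pattern], [target])


# A made-up dictionary entry used only for illustration.
entry = ("entry", [
    ("headword", []),
    ("sense", [
        ("definition", []),
        ("quotation", [("author", []), ("text", [])]),
    ]),
])

# The query names only the parts it cares about; the intermediate
# "sense" and "quotation" levels need not be known to the user.
query = ("entry", [("headword", []), ("author", [])])
print(tree_included(query, entry))
```

The query succeeds although it skips two levels of the entry's structure, which illustrates the data independence described above: the user states only the ancestor and ordering relations that matter. Under this ordered variant, swapping `headword` and `author` in the query would make it fail.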
