A language for queries on structure and contents of textual databases

We present a model for querying textual databases by both the structure and the contents of the text. Our goal is a query language that is expressive enough in practice and also efficiently implementable, a combination that previous work does not offer. We evaluate the model with respect to both expressivity and efficiency. The key idea is that a set-oriented query language based on operations on nearby structure elements of one or more hierarchies is quite expressive and efficiently implementable, and is therefore a good tradeoff between the two goals.
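
As a rough illustration of what "set-oriented operations on structure elements" could look like, the following minimal Python sketch treats each structural element as a text segment and defines one containment operator over sets of segments. The names `Node`, `word_matches`, and `containing` are hypothetical and chosen only for this example; they are not the operators of the model described above, and the naive scanning used here says nothing about the efficiency the model aims for.

```python
from typing import NamedTuple, Set


class Node(NamedTuple):
    """A structural element or text match, seen as a segment of the text."""
    kind: str   # hypothetical label, e.g. "section" or "match"
    start: int  # offset of the first character covered
    end: int    # offset one past the last character covered


def word_matches(text: str, word: str) -> Set[Node]:
    """Content query: the set of segments where `word` occurs in `text`."""
    hits = set()
    pos = text.find(word)
    while pos != -1:
        hits.add(Node("match", pos, pos + len(word)))
        pos = text.find(word, pos + 1)
    return hits


def containing(outer: Set[Node], inner: Set[Node]) -> Set[Node]:
    """Structure query: nodes of `outer` that include some node of `inner`."""
    return {
        o for o in outer
        if any(o.start <= i.start and i.end <= o.end for i in inner)
    }


# Build a toy one-level hierarchy: three "section" nodes over a short text.
parts = ["Intro about trees.", "Body about text.", "More on trees."]
text = " ".join(parts)
sections: Set[Node] = set()
offset = 0
for part in parts:
    sections.add(Node("section", offset, offset + len(part)))
    offset += len(part) + 1  # skip the joining space

# "Sections that contain the word 'trees'": operands and result are all sets.
result = containing(sections, word_matches(text, "trees"))
```

Because every operator takes sets of nodes and returns a set of nodes, queries compose freely: the result of `containing` can be passed to another structural operator without any special casing.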
