An Extended Model for Full Text Databases

Query languages of full text retrieval systems are based on several assumptions about the input text. Many systems assume that the input has some basic structure (say, documents and words), which restricts the application domain. Others allow only documents to be retrieved, not text positions. We address these problems by proposing an extended model for text and a query language based on it. The query language is powerful enough to serve as an intermediate language for customized and/or intelligent retrieval applications based on visual interfaces. We also discuss several implementation issues and give preliminary experimental results.
