Managing syntactic variation in text retrieval

The performance of Information Retrieval systems is limited by the linguistic variation inherent in natural language. Natural Language Processing techniques for managing this problem have long been studied, but mainly for English. In this paper we deal with European languages, taking Spanish as a case in point. We study two different sources of syntactic information, queries and documents, in order to increase the performance of Information Retrieval systems.
