Natural language information retrieval: progress report

Abstract

Natural language processing (NLP) techniques may hold tremendous potential for overcoming the inadequacies of purely quantitative methods of text information retrieval, but the empirical evidence to support such predictions has thus far been inadequate, and evaluations at an appropriate scale have been slow to emerge. In this paper, we report on the progress of the Natural Language Information Retrieval project, a joint effort of several sites led by GE Research, and on its evaluation at the 6th Text Retrieval Conference (TREC-6). We describe the ‘stream architecture’, a method we designed to combine evidence obtained from several different document representations. Some of the document representations used in the experiments described here rely on phrases and proper names computed using NLP techniques.
