Paper information - Natural language information retrieval: progress report

Natural language information retrieval: progress report

Abstract: Natural language processing (NLP) techniques may hold tremendous potential for overcoming the inadequacies of purely quantitative methods of text information retrieval, but the empirical evidence to support such predictions has thus far been inadequate, and evaluations of appropriate scale have been slow to emerge. In this paper, we report on the progress of the Natural Language Information Retrieval project, a joint effort of several sites led by GE Research, and on its evaluation at the 6th Text Retrieval Conference (TREC-6). We also describe the ‘stream architecture’, a method we designed to combine evidence obtained from several different document representations. Some of the document representations used in the experiments described here involve phrases and proper names computed using NLP techniques.

Tomek Strzalkowski | Jose Perez Carballo

[1] Tomek Strzalkowski, et al. Recent Developments in Natural Language Text Retrieval, 1993, TREC.

[2] Joel L Fagan, et al. Experiments in Automatic Phrase Indexing For Document Retrieval: A Comparison of Syntactic and Non-Syntactic Methods, 1987.

[3] Douglas E. Appelt, et al. SRI's Tipster II Project, 1996, TIPSTER.

[4] Eric Brill, et al. A Simple Rule-Based Part of Speech Tagger, 1992, HLT.

[5] Ellen M. Voorhees, et al. Using WordNet to disambiguate word senses for text retrieval, 1993, SIGIR.

[6] Edward A. Fox, et al. Combining Evidence from Multiple Searches, 1992, TREC.

[7] W. Bruce Croft, et al. Term clustering of syntactic phrases, 1989, SIGIR '90.

[8] Gerard Salton, et al. The SMART Retrieval System, 1971.

[9] Claire Cardie, et al. An Analysis of Statistical and Syntactic Phrases, 1997, RIAO.

[10] Christiane Fellbaum, et al. Book Reviews: WordNet: An Electronic Lexical Database, 1999, CL.

[11] John Bear, et al. Using Information Extraction to Improve Document Retrieval, 1998, TREC.