Paper information - Message extraction through estimation of relevance

Message extraction through estimation of relevance

At both the fast-access and the large-volume ends of the information processing spectrum, the end user may be called an information analyst. The analyst's task as part of an information processing system is to provide the insights and to ask the right questions of the system. The computer should perform all of the routine analysis and comparison of documents. The interaction between the analyst and the computer must make it easy for the analyst to ask the necessary questions and to interpret the computer's response as an answer.

Experiments with the METER system have shown that associative retrieval offers a unique capability for obtaining information in response to queries produced by the analyst. In fact, associative methods offer the most assistance precisely in the cases that are difficult to handle in any other way, namely, when there is a large amount of unformatted English text. Associative methods are also useful when the volume or volatility of the data precludes any detailed knowledge of its contents. In this case the analyst has to rely on the responses to questions in order to gain any knowledge of specific events. Associative retrieval methods allow an analyst to obtain that kind of information even without knowledge of specifics. With any Boolean or keyword system, an analyst must have a more detailed knowledge of vocabulary in order to obtain a comparable response.

The METER system was designed with several goals in mind. Specifically, the system was to exploit the methods of associative retrieval in an effective and inexpensive fashion, and to allow a naive user (that is, someone unfamiliar with the exact content of the database) to access useful information with minimal effort. Our success with these particular goals far exceeded our expectations, given the huge amount of research work already completed in the area.
The particular implementation we built was expected to keep pace with a database of up to 20 000 messages that arrive continuously at a maximum rate of 4000-5000 per day. The system had to provide reasonable response times (one or two minutes) with five simultaneous users, and nearly 24-hour access. The system was required to run on a DEC PDP-11/45 or PDP-11/70 without special hardware. As a tool for information analysis, the METER system was designed in a

Christopher Landauer | Clinton P. Mah
