A trainable document summarizer

To summarize is to reduce in complexity, and hence in length, while retaining some of the essential qualities of the original. This paper focusses on document extracts, a particular kind of computed document summary. Document extracts consisting of roughly 20% of the original can be as informative as the full text of a document, which suggests that even shorter extracts may be useful indicative summaries. The trends in our results agree with those of Edmundson, who used a subjectively weighted combination of features rather than training the feature weights on a corpus.

[1] Hans Peter Luhn, et al. The Automatic Creation of Literature Abstracts, 1958, IBM J. Res. Dev.

[2] Phyllis B. Baxendale, et al. Machine-Made Index for Technical Literature - An Experiment, 1958, IBM J. Res. Dev.

[3] Gustave J. Rath, et al. The formation of abstracts by the selection of sentences, 1961.

[4] H. P. Edmundson, et al. New Methods in Automatic Extracting, 1969, JACM.

[5] E. F. Skorochod'ko. Adaptive Method of Automatic Abstracting and Indexing, 1971, IFIP Congress.

[6] J. I. Tait, et al. Generating summaries using a script-based language analyser, 1985.

[7] Udo Hahn, et al. Text condensation as knowledge base abstraction, 1988, [1988] Proceedings. The Fourth Conference on Artificial Intelligence Applications.

[8] Lisa F. Rau, et al. SCISOR: extracting information from on-line news, 1990, CACM.

[9] Christoph Schwarz. Content based text handling, 1990, Inf. Process. Manag.

[10] Chris D. Paice, et al. Constructing literature abstracts by computer: Techniques and prospects, 1990, Inf. Process. Manag.

[11] Francine R. Chen, et al. The use of emphasis to automatically summarize a spoken discourse, 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[12] George M. Kasper, et al. The Effects and Limitations of Automated Text Condensing on Reading Comprehension Performance, 1992, Inf. Syst. Res.

[13] James Allan, et al. Approaches to passage retrieval in full text information systems, 1993, SIGIR.

[14] Karen Sparck Jones. Discourse modelling for automatic summarising, 1993.

[15] Chris D. Paice, et al. The identification of important concepts in highly structured technical papers, 1993, SIGIR.

[16] Seiji Miike, et al. A full-text retrieval system with a dynamic abstract generation function, 1994, SIGIR '94.

[17] G. Salton, et al. Automatic Analysis, Theme Generation, and Summarization of Machine-Readable Texts, 1994, Science.