Paper information - Automatic Text Document Summarization Based on Machine Learning

Automatic summary generation has gained importance with the unprecedented volume of information available on the Internet. Automatic systems based on extractive summarization techniques select the most significant sentences of one or more texts to generate a summary. This article uses Machine Learning techniques to assess the quality of the twenty most referenced strategies used in extractive summarization and integrates them in a tool. The assessment considers both quantitative and qualitative aspects, and its results demonstrate the validity of the proposed scheme. The experiments were performed on the CNN-corpus, possibly the largest and most suitable test corpus available today for benchmarking extractive summarization strategies.
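
To make "selecting the most significant sentences" concrete, the following is a minimal sketch of one simple extractive strategy, word-frequency sentence scoring. It is an illustration only and is not taken from the article: the function names, the naive sentence splitter, and the tiny stop-word list are all assumptions, and the abstract does not say whether or how this particular strategy is implemented in the tool.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "on", "it", "for"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase word tokens with stop words removed."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return [w for w in words if w not in STOP_WORDS]


def summarize(text, n_sentences=2):
    """Return the n highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    # Document-wide word frequencies act as a crude measure of significance.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        # A sentence scores the sum of the frequencies of its words.
        return sum(freq[w] for w in tokenize(sentence))

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]),
                    reverse=True)
    chosen = sorted(ranked[:n_sentences])  # restore document order
    return [sentences[i] for i in chosen]


if __name__ == "__main__":
    sample = "Cats chase mice. Dogs chase cats and mice. The weather is nice."
    print(summarize(sample, n_sentences=1))
```

Summing raw frequencies favors longer sentences, so a practical scorer would usually normalize by sentence length or combine several features; the sketch omits this for brevity.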
