A Comparison of Automatic Summarizers of Texts in Brazilian Portuguese

Automatic Summarization (AS) has only recently become a significant research topic in Brazil. Compared with initiatives for other languages, this delay can be explained by the lack of specific resources, such as expressive lexicons and corpora, that could provide adequate foundations for deep or shallow approaches to AS. Taking advantage of shared resources and a common corpus of texts and summaries written in Brazilian Portuguese, two NLP research groups have decided to start a common task to assess and compare their AS systems. In the experiment, five distinct extractive AS systems have been assessed. Some of them incorporate techniques that have already been used to summarize texts in English; others propose novel approaches to AS. Two baseline systems have also been considered. An overall performance comparison has been carried out, and its outcomes are discussed in this paper.
