Combining a Multi-Document Update Summarization System –CBSEAS– with a Genetic Algorithm

In this paper, we present a combination of a multi-document summarization system with a genetic algorithm. We first introduce a novel approach to automatic summarization. CBSEAS, the system that implements this approach, integrates a new redundancy-detection method at its very core in order to produce summaries with good informational diversity. However, the evaluation of our system at TAC 2008 (Text Analysis Conference) revealed that adapting the system to a specific domain is essential to obtaining summaries of acceptable quality.
