Multilingual Statistical News Summarisation: Preliminary Experiments with English

In this paper we present a generic approach for summarising multilingual news clusters, such as those produced by the Europe Media Monitor (EMM) system. The approach is generic because the summarisation step relies on robust statistical techniques, and it is multilingual because the source representation is built with a multilingual entity disambiguation system. We ran preliminary experiments on the TAC 2008 data, an English corpus for summarisation research, and obtained promising improvements over a summarisation system that ranked in the top 20% at the TAC 2008 competition.
