Cross-Language Text Summarization Using Sentence and Multi-Sentence Compression

Cross-Language Automatic Text Summarization produces a summary in a language different from that of the source documents. In this paper, we propose a French-to-English cross-lingual summarization framework that analyzes the information in both languages to identify the most relevant sentences. To generate more informative cross-lingual summaries, we introduce the use of chunks and two compression methods, one at the sentence level and one at the multi-sentence level. Experimental results on the MultiLing 2011 dataset show that our framework improves on the results obtained by state-of-the-art approaches according to ROUGE metrics.
