Cross-Lingual Speech-to-Text Summarization

Cross-lingual text summarization generates a summary in a language different from that of the source documents. We propose a French-to-English cross-lingual transcript summarization framework that automatically segments a French transcript into sentences and analyzes information in both the source and target languages to estimate sentence saliency. We also use a multi-sentence compression method that shortens sentences while improving their informativeness. Experimental results show that our framework outperforms extractive methods using automatic sentence segmentation, even in the presence of transcription errors.
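To make the idea of estimating saliency from both languages concrete, the following is a minimal sketch and not the scoring used in the framework itself. It assumes the transcript is already segmented and translated, so that each French sentence is aligned one-to-one with an English sentence. The word-overlap similarity, the centrality score, the function names, and the weight `alpha` are all illustrative assumptions.

```python
def jaccard(a, b):
    """Word-overlap similarity between two sentences (lowercased tokens)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def centrality(sentences):
    """Score each sentence by its total similarity to all other sentences."""
    return [
        sum(jaccard(s, t) for j, t in enumerate(sentences) if j != i)
        for i, s in enumerate(sentences)
    ]


def bilingual_saliency(src_sentences, tgt_sentences, alpha=0.5):
    """Combine source- and target-language centrality for aligned sentences.

    Hypothetical helper: alpha weights the source-language score and
    (1 - alpha) weights the score computed on the translation.
    """
    if len(src_sentences) != len(tgt_sentences):
        raise ValueError("source and target sentences must be aligned 1:1")
    src = centrality(src_sentences)
    tgt = centrality(tgt_sentences)
    return [alpha * s + (1 - alpha) * t for s, t in zip(src, tgt)]
```

Sentences with the highest combined score would be the candidates passed on to compression. Scoring in both languages is meant to reduce the effect of a poor translation or a transcription error that distorts a sentence in only one of them.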
