Online Sentence Segmentation for Simultaneous Interpretation using Multi-Shifted Recurrent Neural Network

This paper develops a recurrent neural network (RNN) solution for segmenting the unpunctuated transcripts generated by automatic speech recognition for simultaneous interpretation. RNNs are effective at capturing long-distance dependencies and are straightforward to use for online decoding. They are therefore better suited to the task than conventional n-gram language model (LM) based approaches and recent neural machine translation based approaches. This paper proposes a multi-shifted RNN to address the trade-off between accuracy and latency, which is one of the key characteristics of the task. Experiments show that the proposed method improves segmentation accuracy, measured in F1, by 21.1% while maintaining approximately the same latency, and reduces the BLEU loss relative to the oracle segmentation by 28.6%, when compared to a strong baseline of the RNN LM-based method. Our online sentence segmentation toolkit is open-sourced to promote the field.
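The abstract reports segmentation accuracy as F1. As a rough illustration of how such a score can be computed, the sketch below treats a segmentation as a set of boundary positions (the index of the word after which a sentence ends) and compares predicted boundaries against reference boundaries. The function name `boundary_f1` and the exact-position matching rule are assumptions made for this example; the paper's own evaluation protocol may differ.

```python
def boundary_f1(predicted, reference):
    """F1 over sentence-boundary positions.

    Both arguments are iterables of word indices after which a sentence
    boundary is placed. A predicted boundary counts as correct only if the
    same index appears in the reference (an assumption of this sketch).
    """
    pred, ref = set(predicted), set(reference)
    if not pred or not ref:
        return 0.0
    true_positives = len(pred & ref)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical boundaries: two of the three predictions match the reference.
    print(boundary_f1([3, 7, 12], [3, 8, 12]))
```

In this hypothetical case precision and recall are both 2/3, so F1 is also 2/3. A segmenter that waits for more right context before deciding would typically score higher on this measure at the cost of higher latency, which is the trade-off the abstract describes.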

Xiaolin Wang | Masao Utiyama | Eiichiro Sumita
