Paper Information - System Description on Automatic Simultaneous Translation Workshop

This paper describes our submission to the second Automatic Simultaneous Translation Workshop at NAACL 2021. We participate in both Chinese-to-English tracks: Chinese audio → English text and Chinese text → English text. We apply data filtering and model training techniques to maximize the BLEU score and reduce the average lagging. We propose a two-stage simultaneous translation pipeline composed of a QuartzNet speech recognizer and a BPE-based Transformer translation model. The resulting system is competitive and achieves a BLEU score of 24.39 in the audio input track.
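The abstract does not say how average lagging is computed. As an illustration only, the sketch below follows the Average Lagging (AL) definition from the STACL paper: AL = (1/τ) Σ_{t=1..τ} [g(t) − (t−1)/r], where g(t) is the number of source tokens read before emitting target token t, r = |y|/|x| is the target-to-source length ratio, and τ is the first target step at which the full source has been read. The function names `wait_k_schedule` and `average_lagging` are hypothetical, and the wait-k schedule is an assumed example policy, not necessarily the one used in the submission.

```python
def wait_k_schedule(k, src_len, tgt_len):
    """Return g(t) for t = 1..tgt_len under an assumed wait-k policy.

    g(t) is the number of source tokens read before target token t is emitted.
    """
    return [min(k + t - 1, src_len) for t in range(1, tgt_len + 1)]


def average_lagging(g, src_len, tgt_len):
    """Average Lagging of a read schedule g (a list holding g(1)..g(tgt_len))."""
    r = tgt_len / src_len  # target-to-source length ratio
    # tau: first target step at which the whole source has been read;
    # falls back to the last step if the schedule never reaches src_len.
    tau = next((t for t, read in enumerate(g, start=1) if read >= src_len), len(g))
    return sum(g[t - 1] - (t - 1) / r for t in range(1, tau + 1)) / tau


if __name__ == "__main__":
    # Wait-3 policy on a 10-token source and 10-token target.
    print(average_lagging(wait_k_schedule(3, 10, 10), 10, 10))
    # Full-sentence (non-simultaneous) translation reads everything first.
    print(average_lagging([10] * 10, 10, 10))
```

With equal source and target lengths, a wait-k schedule lags by k tokens on average, while full-sentence translation lags by the whole source length, which is why a lower AL indicates a more responsive system.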
