Shinji Watanabe | Xuankai Chang | Yuya Fujita | Tianzi Wang
[1] Victor O. K. Li, et al. Non-Autoregressive Neural Machine Translation, 2017, ICLR.
[2] Jindrich Libovický, et al. End-to-End Non-Autoregressive Neural Machine Translation with Connectionist Temporal Classification, 2018, EMNLP.
[3] Shinji Watanabe, et al. Insertion-Based Modeling for End-to-End Automatic Speech Recognition, 2020, INTERSPEECH.
[4] Qian Zhang, et al. Transformer Transducer: One Model Unifying Streaming and Non-streaming Speech Recognition, 2020, ArXiv.
[5] Quoc V. Le, et al. SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition, 2019, INTERSPEECH.
[6] Daniel Povey, et al. The Kaldi Speech Recognition Toolkit, 2011.
[7] Hairong Liu, et al. Exploring neural transducers for end-to-end speech recognition, 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[8] Katrin Kirchhoff, et al. Align-Refine: Non-Autoregressive Speech Recognition via Iterative Realignment, 2020, NAACL.
[9] Jürgen Schmidhuber, et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks, 2006, ICML.
[10] Kevin Duh, et al. ORTHROS: Non-Autoregressive End-to-End Speech Translation with Dual-Decoder, 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[11] Omer Levy, et al. Mask-Predict: Parallel Decoding of Conditional Masked Language Models, 2019, EMNLP.
[12] Navdeep Jaitly, et al. Imputer: Sequence Modelling via Imputation and Dynamic Programming, 2020, ICML.
[13] Gil Keren, et al. Alignment Restricted Streaming Recurrent Neural Network Transducer, 2021, 2021 IEEE Spoken Language Technology Workshop (SLT).
[14] Tara N. Sainath, et al. Towards Fast and Accurate Streaming End-To-End ASR, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Tatsuya Kawahara, et al. Enhancing Monotonic Multihead Attention for Streaming ASR, 2020, INTERSPEECH.
[16] Shinji Watanabe, et al. Recent Developments on ESPnet Toolkit Boosted by Conformer, 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Hao Zheng, et al. AISHELL-1: An open-source Mandarin speech corpus and a speech recognition baseline, 2017, 2017 20th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment (O-COCOSDA).
[18] Colin Raffel, et al. Monotonic Chunkwise Attention, 2017, ICLR.
[19] Tetsunori Kobayashi, et al. Mask CTC: Non-Autoregressive End-to-End ASR with CTC and Mask Predict, 2020, INTERSPEECH.
[20] Wei Chu, et al. CASS-NAT: CTC Alignment-based Single Step Non-autoregressive Transformer for Speech Recognition, 2020, ArXiv.
[21] Tetsuji Ogawa, et al. Improved Mask-CTC for Non-Autoregressive End-to-End ASR, 2020, ArXiv.
[22] Shinji Watanabe, et al. Listen and Fill in the Missing Letters: Non-Autoregressive Transformer for Speech Recognition, 2019, ArXiv.
[23] Yiming Yang, et al. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context, 2019, ACL.
[24] Yu Zhang, et al. Conformer: Convolution-augmented Transformer for Speech Recognition, 2020, INTERSPEECH.
[25] Yoshua Bengio, et al. End-to-end Continuous Speech Recognition using Attention-based Recurrent NN: First Results, 2014, ArXiv.
[26] Jonathan Le Roux, et al. Streaming Automatic Speech Recognition with the Transformer Model, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] Xiaofei Wang, et al. A Comparative Study on Transformer vs RNN in Speech Applications, 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[28] Kjell Schubert, et al. RNN-T For Latency Controlled ASR With Improved Beam Search, 2019, ArXiv.
[29] Alex Graves, et al. Sequence Transduction with Recurrent Neural Networks, 2012, ArXiv.
[30] Shinji Watanabe, et al. Streaming Transformer ASR with Blockwise Synchronous Beam Search, 2020, 2021 IEEE Spoken Language Technology Workshop (SLT).
[31] Shuai Zhang, et al. Spike-Triggered Non-Autoregressive Transformer for End-to-End Speech Recognition, 2020, INTERSPEECH.
[32] Shinji Watanabe, et al. ESPnet: End-to-End Speech Processing Toolkit, 2018, INTERSPEECH.
[33] Paul Deléglise, et al. Enhancing the TED-LIUM Corpus with Selected Data for Language Modeling and More TED Talks, 2014, LREC.
[34] Yonghong Yan, et al. Transformer-Based Online CTC/Attention End-To-End Speech Recognition Architecture, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[35] Tara N. Sainath, et al. A Comparison of End-to-End Models for Long-Form Speech Recognition, 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).