Kenneth Ward Church | Jiahong Yuan | Renjie Zheng | Liang Huang | Xingyu Cai | Dongji Gao
[1] Yu Zhang,et al. Conformer: Convolution-augmented Transformer for Speech Recognition , 2020, INTERSPEECH.
[2] Peter Bell,et al. Stochastic Attention Head Removal: A Simple and Effective Method for Improving Transformer Based ASR Models , 2021, Interspeech.
[3] Haizhou Li,et al. Self-and-Mixed Attention Decoder with Deep Acoustic Structure for Transformer-based LVCSR , 2020, INTERSPEECH.
[4] Lei Xie,et al. WeNet: Production Oriented Streaming and Non-Streaming End-to-End Speech Recognition Toolkit , 2021, Interspeech.
[5] Improved Conformer-based End-to-End Speech Recognition Using Neural Architecture Search , 2021, ArXiv.
[6] Ning Cheng,et al. Applying wav2vec2.0 to Speech Recognition in various low-resource languages , 2020, ArXiv.
[7] Boris Ginsburg,et al. Citrinet: Closing the Gap between Non-Autoregressive and Autoregressive End-to-End Models for Automatic Speech Recognition , 2021, arXiv:2104.01721.
[8] Lei Xie,et al. Towards Language-Universal Mandarin-English Speech Recognition , 2019, INTERSPEECH.
[9] Wei Chen,et al. WNARS: WFST based Non-autoregressive Streaming End-to-End Speech Recognition , 2021, ArXiv.
[10] Michael Picheny,et al. New methods in continuous Mandarin speech recognition , 1997, EUROSPEECH.
[11] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[12] Lujun Li,et al. Adversarial joint training with self-attention mechanism for robust end-to-end speech recognition , 2021, EURASIP Journal on Audio, Speech, and Music Processing.
[13] Lei Xie,et al. Boundary and Context Aware Training for CIF-Based Non-Autoregressive End-to-End ASR , 2021, 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[14] Shuai Zhang,et al. One in a hundred: Select the best predicted sequence from numerous candidates for streaming speech recognition , 2020 .
[15] William Chan,et al. On Online Attention-Based Speech Recognition and Joint Mandarin Character-Pinyin Training , 2016, INTERSPEECH.
[16] Chng Eng Siong,et al. Independent Language Modeling Architecture for End-To-End ASR , 2019, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Abdel-rahman Mohamed,et al. wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations , 2020, NeurIPS.
[18] Hao Zheng,et al. AISHELL-1: An open-source Mandarin speech corpus and a speech recognition baseline , 2017, 2017 20th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment (O-COCOSDA).
[19] Tara N. Sainath,et al. Bytes Are All You Need: End-to-end Multilingual Speech Recognition and Synthesis with Bytes , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Chao Huang,et al. Large vocabulary Mandarin speech recognition with different approaches in modeling tones , 2000, INTERSPEECH.
[21] Chao Weng,et al. Non-Autoregressive Transformer ASR with CTC-Enhanced Decoder Input , 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[22] Michael I. Jordan,et al. Advances in Neural Information Processing Systems 30 , 1995 .
[23] Shiliang Zhang,et al. Simplified Self-Attention for Transformer-Based end-to-end Speech Recognition , 2020, 2021 IEEE Spoken Language Technology Workshop (SLT).
[24] Kenneth Ward Church,et al. Automatic recognition of suprasegmentals in speech , 2021, ArXiv.
[25] Kuan-Yu Chen,et al. Non-autoregressive Transformer-based End-to-end ASR using BERT , 2021, ArXiv.
[26] Jing Xiao,et al. Multi-Quartznet: Multi-Resolution Convolution for Speech Recognition with Multi-Layer Feature Fusion , 2020, 2021 IEEE Spoken Language Technology Workshop (SLT).
[27] Xiao-Lei Zhang,et al. Efficient conformer-based speech recognition with linear attention , 2021, 2021 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC).
[28] Lei Xie,et al. Efficient Gradient-Based Neural Architecture Search For End-to-End ASR , 2021, ICMI Companion.
[29] Lei Xie,et al. Unified Streaming and Non-streaming Two-pass End-to-end Model for Speech Recognition , 2020, ArXiv.
[30] Jianhua Tao,et al. Gated Recurrent Fusion With Joint Training Framework for Robust End-to-End Speech Recognition , 2020, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[31] Lei Xie,et al. Cascade RNN-Transducer: Syllable Based Streaming On-Device Mandarin Speech Recognition with a Syllable-To-Character Converter , 2020, 2021 IEEE Spoken Language Technology Workshop (SLT).
[32] Menglong Xu,et al. Transformer-based end-to-end speech recognition with residual Gaussian-based self-attention , 2021, Interspeech 2021.
[33] Xiangang Li,et al. A Further Study of Unsupervised Pretraining for Transformer Based Speech Recognition , 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[34] J. Tao,et al. Listen Attentively, and Spell Once: Whole Sentence Generation via a Non-Autoregressive Architecture for Low-Latency Speech Recognition , 2020, INTERSPEECH.
[35] Wen Wang,et al. Building A Highly Accurate Mandarin Speech Recognizer With Language-Independent Technologies and Language-Dependent Modules , 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[36] Xiangang Li,et al. A comparative study on selecting acoustic modeling units in deep neural networks based large vocabulary Chinese speech recognition , 2013, Neurocomputing.
[37] Kenneth Ward Church,et al. Speech Emotion Recognition with Multi-Task Learning , 2021, Interspeech.
[38] FastCorrect: Fast Error Correction with Edit Alignment for Automatic Speech Recognition , 2021, ArXiv.
[39] Mohan Li,et al. Transformer-Based Online Speech Recognition with Decoder-end Adaptive Computation Steps , 2020, 2021 IEEE Spoken Language Technology Workshop (SLT).
[40] Tatsuya Komatsu,et al. Relaxing the Conditional Independence Assumption of CTC-based ASR by Conditioning on Intermediate Predictions , 2021, Interspeech.
[41] Jun Zhang,et al. Improving RNN transducer with normalized jointer network , 2020, ArXiv.
[42] Weibin Zhang,et al. Multi-head Monotonic Chunkwise Attention For Online Speech Recognition , 2020, ArXiv.
[43] Shuai Zhang,et al. Decoupling Pronunciation and Language for End-to-End Code-Switching Automatic Speech Recognition , 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[44] Mari Ostendorf,et al. Modeling lexical tones for mandarin large vocabulary continuous speech recognition , 2006 .
[45] Shiliang Zhang,et al. Investigation of Modeling Units for Mandarin Speech Recognition Using DFSMN-CTC-sMBR , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[46] Shuang Xu,et al. A Comparison of Modeling Units in Sequence-to-Sequence Speech Recognition with the Transformer on Mandarin Chinese , 2018, ICONIP.
[47] Li Fu,et al. Research on Modeling Units of Transformer Transducer for Mandarin Speech Recognition , 2020, ArXiv.
[48] Shouyi Yin,et al. Transformer with Bidirectional Decoder for Speech Recognition , 2020, INTERSPEECH.
[49] Shinji Watanabe,et al. Intermediate Loss Regularization for CTC-Based Speech Recognition , 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[50] Rama Doddipatla,et al. Head-Synchronous Decoding for Transformer-Based Streaming ASR , 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).