Tara N. Sainath | Ke Hu | Weiran Wang
[1] Tara N. Sainath, et al. Recognizing Long-Form Speech Using Streaming End-to-End Models, 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[2] Tara N. Sainath, et al. Streaming End-to-end Speech Recognition for Mobile Devices, 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Tara N. Sainath, et al. Tied & Reduced RNN-T Decoder, 2021, Interspeech.
[4] Yu Zhang, et al. Conformer: Convolution-augmented Transformer for Speech Recognition, 2020, INTERSPEECH.
[5] Yifan Gong, et al. Improving wideband speech recognition using mixed-bandwidth training data in CD-DNN-HMM, 2012, 2012 IEEE Spoken Language Technology Workshop (SLT).
[6] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[7] Quoc V. Le, et al. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition, 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Tara N. Sainath, et al. A Streaming On-Device End-To-End Model Surpassing Server-Side Conventional Model Quality and Latency, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Tara N. Sainath, et al. Deliberation Model Based Two-Pass End-To-End Speech Recognition, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Tetsuji Ogawa, et al. Improved Mask-CTC for Non-Autoregressive End-to-End ASR, 2020, ArXiv.
[11] Tara N. Sainath, et al. Generation of Large-Scale Simulated Utterances in Virtual Rooms to Train Deep-Neural Networks for Far-Field Speech Recognition in Google Home, 2017, INTERSPEECH.
[12] Katrin Kirchhoff, et al. Align-Refine: Non-Autoregressive Speech Recognition via Iterative Realignment, 2020, NAACL.
[13] Tetsunori Kobayashi, et al. Mask CTC: Non-Autoregressive End-to-End ASR with CTC and Mask Predict, 2020, INTERSPEECH.
[14] Tara N. Sainath, et al. Cascaded Encoders for Unifying Streaming and Non-Streaming ASR, 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Tara N. Sainath, et al. Transformer Based Deliberation for Two-Pass Speech Recognition, 2021, 2021 IEEE Spoken Language Technology Workshop (SLT).
[16] Shinji Watanabe, et al. Streaming End-to-End ASR based on Blockwise Non-Autoregressive Models, 2021, Interspeech.
[17] Nenghai Yu, et al. Deliberation Networks: Sequence Generation Beyond One-Pass Decoding, 2017, NIPS.
[18] Shinji Watanabe, et al. Listen and Fill in the Missing Letters: Non-Autoregressive Transformer for Speech Recognition, 2019, ArXiv.
[19] Alex Graves, et al. Sequence Transduction with Recurrent Neural Networks, 2012, ArXiv.
[20] Tara N. Sainath, et al. Lower Frame Rate Neural Network Acoustic Models, 2016, INTERSPEECH.
[21] Omer Levy, et al. Mask-Predict: Parallel Decoding of Conditional Masked Language Models, 2019, EMNLP.
[22] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[23] Navdeep Jaitly, et al. Imputer: Sequence Modelling via Imputation and Dynamic Programming, 2020, ICML.
[24] Quoc V. Le, et al. SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition, 2019, INTERSPEECH.
[25] Quoc V. Le, et al. Specaugment on Large Scale Datasets, 2019, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[26] Jürgen Schmidhuber, et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks, 2006, ICML.