暂无分享,去创建一个
Tara N. Sainath | Rami Botros | Wei Li | Robert David | Emmanuel Guzman | Yanzhang He | Rami Botros | Yanzhang He | Wei Li | R. David | Emmanuel Guzman
[1] Quoc V. Le,et al. SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition , 2019, INTERSPEECH.
[2] Alex Graves,et al. Supervised Sequence Labelling with Recurrent Neural Networks , 2012, Studies in Computational Intelligence.
[3] Tara N. Sainath,et al. Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling , 2019, ArXiv.
[4] Sining Sun,et al. Tiny Transducer: A Highly-Efficient Speech Recognition Model on Edge Devices , 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Tara N. Sainath,et al. Recognizing Long-Form Speech Using Streaming End-to-End Models , 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[6] Tara N. Sainath,et al. A Streaming On-Device End-To-End Model Surpassing Server-Side Conventional Model Quality and Latency , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[7] Tara N. Sainath,et al. Less is More: Improved RNN-T Decoding Using Limited Label Context and Path Merging , 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Qian Zhang,et al. Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Tara N. Sainath,et al. Streaming End-to-end Speech Recognition for Mobile Devices , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Xiaofeng Liu,et al. Rnn-Transducer with Stateless Prediction Network , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[11] Yoshua Bengio,et al. End-to-end attention-based large vocabulary speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Hank Liao,et al. Large scale deep neural network acoustic modeling with semi-supervised training data for YouTube video transcription , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[13] Tara N. Sainath,et al. Generation of Large-Scale Simulated Utterances in Virtual Rooms to Train Deep-Neural Networks for Far-Field Speech Recognition in Google Home , 2017, INTERSPEECH.
[14] Tara N. Sainath,et al. Echo State Speech Recognition , 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Hakan Inan,et al. Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling , 2016, ICLR.
[16] Alex Graves,et al. Sequence Transduction with Recurrent Neural Networks , 2012, ArXiv.
[17] Yashesh Gaur,et al. On the Comparison of Popular End-to-End Models for Large Scale Speech Recognition , 2020, INTERSPEECH.
[18] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[19] Tara N. Sainath,et al. Lower Frame Rate Neural Network Acoustic Models , 2016, INTERSPEECH.
[20] Cyril Allauzen,et al. Hybrid Autoregressive Transducer (HAT) , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] Quoc V. Le,et al. Specaugment on Large Scale Datasets , 2019, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[22] Geoffrey E. Hinton,et al. Layer Normalization , 2016, ArXiv.
[23] Tara N. Sainath,et al. State-of-the-Art Speech Recognition with Sequence-to-Sequence Models , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[24] Hagen Soltau,et al. Neural Speech Recognizer: Acoustic-to-Word LSTM Model for Large Vocabulary Speech Recognition , 2016, INTERSPEECH.
[25] Tara N. Sainath,et al. A Better and Faster end-to-end Model for Streaming ASR , 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[26] Quoc V. Le,et al. Listen, Attend and Spell , 2015, ArXiv.
[27] Tara N. Sainath,et al. Minimum Word Error Rate Training for Attention-Based Sequence-to-Sequence Models , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[28] Yifan Gong,et al. Improving RNN Transducer Modeling for End-to-End Speech Recognition , 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[29] Quoc V. Le,et al. Searching for Activation Functions , 2018, arXiv.
[30] Chengzhu Yu,et al. Minimum Bayes Risk Training of RNN-Transducer for End-to-End Speech Recognition , 2019, INTERSPEECH.
[31] Andreas Stolcke,et al. Efficient minimum word error rate training of RNN-Transducer for end-to-end speech recognition , 2020, INTERSPEECH.
[32] Alex Graves,et al. Supervised Sequence Labelling , 2012 .
[33] Yu Zhang,et al. Conformer: Convolution-augmented Transformer for Speech Recognition , 2020, INTERSPEECH.
[34] Fadi Biadsy,et al. Effectively Building Tera Scale MaxEnt Language Models Incorporating Non-Linguistic Signals , 2017, INTERSPEECH.
[35] Yifan Gong,et al. Improving wideband speech recognition using mixed-bandwidth training data in CD-DNN-HMM , 2012, 2012 IEEE Spoken Language Technology Workshop (SLT).