Tara N. Sainath | Arun Narayanan | Bo Li | Yu Zhang | Jiahui Yu | Chung-Cheng Chiu | Yonghui Wu | Ruoming Pang | Trevor Strohman | Qiao Liang | Anmol Gulati | Yanzhang He | Shuo-Yiin Chang | James Qin | Wei Han
[1] Jonathan Le Roux,et al. Triggered Attention for End-to-end Speech Recognition , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Tara N. Sainath,et al. Emitting Word Timings with End-to-End Models , 2020, INTERSPEECH.
[3] Yiming Yang,et al. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context , 2019, ACL.
[4] Tara N. Sainath,et al. Streaming End-to-end Speech Recognition for Mobile Devices , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Tara N. Sainath,et al. Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling , 2019, ArXiv.
[6] Tara N. Sainath,et al. Towards Fast and Accurate Streaming End-To-End ASR , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[7] Geoffrey Zweig,et al. Transformer-Based Acoustic Modeling for Hybrid Speech Recognition , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Mike Schuster,et al. Japanese and Korean voice search , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Tara N. Sainath,et al. Lower Frame Rate Neural Network Acoustic Models , 2016, INTERSPEECH.
[10] Quoc V. Le,et al. Listen, Attend and Spell , 2015, ArXiv.
[11] Fabrizio Silvestri,et al. Prefetching query results and its impact on search engines , 2012, SIGIR '12.
[12] Tara N. Sainath,et al. A Streaming On-Device End-To-End Model Surpassing Server-Side Conventional Model Quality and Latency , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[13] Shinji Watanabe,et al. Joint CTC-attention based end-to-end speech recognition using multi-task learning , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Chengyi Wang,et al. Reducing the Latency of End-to-End Streaming Speech Recognition Models with a Scout Network , 2020 .
[15] Martín Abadi,et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems , 2016, ArXiv.
[16] Tara N. Sainath,et al. Shallow-Fusion End-to-End Contextual Biasing , 2019, INTERSPEECH.
[17] Tara N. Sainath,et al. Generation of Large-Scale Simulated Utterances in Virtual Rooms to Train Deep-Neural Networks for Far-Field Speech Recognition in Google Home , 2017, INTERSPEECH.
[18] Tara N. Sainath,et al. State-of-the-Art Speech Recognition with Sequence-to-Sequence Models , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Tara N. Sainath,et al. Recognizing Long-Form Speech Using Streaming End-to-End Models , 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[20] Fadi Biadsy,et al. Effectively Building Tera Scale MaxEnt Language Models Incorporating Non-Linguistic Signals , 2017, INTERSPEECH.
[21] Gang Liu,et al. An Online Attention-based Model for Speech Recognition , 2018, INTERSPEECH.
[22] Samy Bengio,et al. An Online Sequence-to-Sequence Model Using Partial Conditioning , 2015, NIPS.
[23] Chengyi Wang,et al. Low Latency End-to-End Streaming Speech Recognition with a Scout Network , 2020, INTERSPEECH.
[24] Yifan Gong,et al. Improving wideband speech recognition using mixed-bandwidth training data in CD-DNN-HMM , 2012, 2012 IEEE Spoken Language Technology Workshop (SLT).
[25] Yu Zhang,et al. Conformer: Convolution-augmented Transformer for Speech Recognition , 2020, INTERSPEECH.
[26] Yashesh Gaur,et al. Minimum Latency Training Strategies for Streaming Sequence-to-Sequence ASR , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[28] Alex Graves,et al. Sequence Transduction with Recurrent Neural Networks , 2012, ArXiv.
[29] Luiz André Barroso,et al. Web Search for a Planet: The Google Cluster Architecture , 2003, IEEE Micro.
[30] Yashesh Gaur,et al. On the Comparison of Popular End-to-End Models for Large Scale Speech Recognition , 2020, INTERSPEECH.
[31] Colin Raffel,et al. Online and Linear-Time Attention by Enforcing Monotonic Alignments , 2017, ICML.
[32] Hank Liao,et al. Large scale deep neural network acoustic modeling with semi-supervised training data for YouTube video transcription , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[33] Tara N. Sainath,et al. Low Latency Speech Recognition Using End-to-End Prefetching , 2020, INTERSPEECH.
[34] Jürgen Schmidhuber,et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks , 2006, ICML.
[35] Kaiming He,et al. Group Normalization , 2018, ECCV.
[36] Shinji Watanabe,et al. Streaming Transformer ASR with Blockwise Synchronous Inference , 2020, ArXiv.
[37] Tara N. Sainath,et al. Two-Pass End-to-End Speech Recognition , 2019, INTERSPEECH.
[38] Tara N. Sainath,et al. FastEmit: Low-latency Streaming ASR with Sequence-level Emission Regularization , 2020, ArXiv.
[39] Tara N. Sainath,et al. A Comparison of End-to-End Models for Long-Form Speech Recognition , 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).