E2E Segmentation in a Two-Pass Cascaded Encoder ASR Model
Tara N. Sainath | Rohit Prabhavalkar | David Rybach | Cyril Allauzen | Trevor Strohman | W. R. Huang | Yanzhang He | Shuo-yiin Chang | Cal Peyser | R. David
[1] Tara N. Sainath, et al. Turn-Taking Prediction for Natural Conversational Speech, 2022, INTERSPEECH.
[2] Tara N. Sainath, et al. Improving The Latency And Quality Of Cascaded Encoders, 2022, 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Tara N. Sainath, et al. E2E Segmenter: Joint Segmenting and Decoding for Long-Form ASR, 2022, INTERSPEECH.
[4] Tara N. Sainath, et al. A Unified Cascaded Encoder ASR Model for Dynamic Model Sizes, 2022, INTERSPEECH.
[5] M. Seltzer, et al. Streaming parallel transducer beam search with fast-slow cascaded encoders, 2022, INTERSPEECH.
[6] Rohit Prabhavalkar, et al. Dissecting User-Perceived Latency of On-Device E2E Speech Recognition, 2021, INTERSPEECH.
[7] Tara N. Sainath, et al. Less is More: Improved RNN-T Decoding Using Limited Label Context and Path Merging, 2020, 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Tara N. Sainath, et al. A Better and Faster end-to-end Model for Streaming ASR, 2020, 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Tara N. Sainath, et al. Cascaded Encoders for Unifying Streaming and Non-Streaming ASR, 2020, 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Tara N. Sainath, et al. RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions, 2020, 2021 IEEE Spoken Language Technology Workshop (SLT).
[11] Lei Xie, et al. Unified Streaming and Non-streaming Two-pass End-to-end Model for Speech Recognition, 2020, arXiv.
[12] Tara N. Sainath, et al. Low Latency Speech Recognition Using End-to-End Prefetching, 2020, INTERSPEECH.
[13] K. Takeda, et al. End-to-End Automatic Speech Recognition Integrated with CTC-Based Voice Activity Detection, 2020, 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Hagen Soltau, et al. Monotonic Recurrent Neural Network Transducer and Decoding Strategies, 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[15] Tara N. Sainath, et al. A Comparison of End-to-End Models for Long-Form Speech Recognition, 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[16] Tara N. Sainath, et al. Recognizing Long-Form Speech Using Streaming End-to-End Models, 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[17] Tara N. Sainath, et al. Two-Pass End-to-End Speech Recognition, 2019, INTERSPEECH.
[18] Arun Narayanan, et al. Toward Domain-Invariant Speech Recognition via Large Scale Training, 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[19] Hagen Soltau, et al. Reducing the computational complexity for whole word models, 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[20] Tara N. Sainath, et al. Feature Learning with Raw-Waveform CLDNNs for Voice Activity Detection, 2016, INTERSPEECH.
[21] Juan Manuel Górriz, et al. Voice Activity Detection. Fundamentals and Speech Recognition System Robustness, 2007.