Joint Endpointing and Decoding with End-to-end Models
Tara N. Sainath | Rohit Prabhavalkar | Gabor Simko | Yanzhang He | Shuo-Yiin Chang
[1] Tara N. Sainath, et al. State-of-the-Art Speech Recognition with Sequence-to-Sequence Models, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Tara N. Sainath, et al. Feature Learning with Raw-Waveform CLDNNs for Voice Activity Detection, 2016, INTERSPEECH.
[3] Shinji Watanabe, et al. Joint CTC-attention based end-to-end speech recognition using multi-task learning, 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[4] Alex Graves, et al. Sequence Transduction with Recurrent Neural Networks, 2012, ArXiv.
[5] Quoc V. Le, et al. Listen, Attend and Spell, 2015, ArXiv.
[6] Tara N. Sainath, et al. Endpoint Detection Using Grid Long Short-Term Memory Networks for Streaming Speech Recognition, 2017, INTERSPEECH.
[7] Geoffrey E. Hinton, et al. Layer Normalization, 2016, ArXiv.
[8] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[9] Matt Shannon, et al. Improved End-of-Query Detection for Streaming Speech Recognition, 2017, INTERSPEECH.
[10] Tara N. Sainath, et al. Generation of Large-Scale Simulated Utterances in Virtual Rooms to Train Deep-Neural Networks for Far-Field Speech Recognition in Google Home, 2017, INTERSPEECH.
[11] Tara N. Sainath, et al. Temporal Modeling Using Dilated Convolution and Gating for Voice-Activity-Detection, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Tara N. Sainath, et al. Convolutional, Long Short-Term Memory, fully connected Deep Neural Networks, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[13] Martín Abadi, et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems, 2016, ArXiv.
[14] Yun Lei, et al. All for one: feature combination for highly channel-degraded speech activity detection, 2013, INTERSPEECH.
[15] Roland Maas, et al. Combining Acoustic Embeddings and Decoding Features for End-of-Utterance Detection in Real-Time Far-Field Speech Recognition Systems, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] Yoshua Bengio, et al. End-to-end attention-based large vocabulary speech recognition, 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Brian Kingsbury, et al. Improvements to the IBM speech activity detection system for the DARPA RATS program, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Rohit Prabhavalkar, et al. Exploring architectures, data and units for streaming end-to-end speech recognition with RNN-transducer, 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).