[1] Philip C. Woodland,et al. Integrating Source-Channel and Attention-Based Sequence-to-Sequence Models for Speech Recognition , 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[2] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[3] Quoc V. Le,et al. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[4] Tara N. Sainath,et al. Deep Context: End-to-end Contextual Speech Recognition , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[5] Chanwoo Kim,et al. Hierarchical Multi-Stage Word-to-Grapheme Named Entity Corrector for Automatic Speech Recognition , 2020, INTERSPEECH.
[6] Yifan Gong,et al. Improving RNN Transducer Modeling for End-to-End Speech Recognition , 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[7] Sanjeev Khudanpur,et al. Librispeech: An ASR corpus based on public domain audio books , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Yoshua Bengio,et al. End-to-end attention-based large vocabulary speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Geoffrey Zweig,et al. Contextualizing ASR Lattice Rescoring with Hybrid Pointer Network Language Model , 2020, INTERSPEECH.
[10] Navdeep Jaitly,et al. Towards Better Decoding and Language Model Integration in Sequence to Sequence Models , 2016, INTERSPEECH.
[11] Steve Renals,et al. A study of the recurrent neural network encoder-decoder for large vocabulary speech recognition , 2015, INTERSPEECH.
[12] Tara N. Sainath,et al. Contextual Speech Recognition with Difficult Negative Training Examples , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[13] Yongqiang Wang,et al. Joint Grapheme and Phoneme Embeddings for Contextual End-to-End ASR , 2019, INTERSPEECH.
[14] Yingbo Zhou,et al. Fast and Robust Unsupervised Contextual Biasing for Speech Recognition , 2020, ArXiv.
[15] Tomohiro Nakatani,et al. Sequence Training of Encoder-Decoder Model Using Policy Gradient for End-to-End Speech Recognition , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] Gil Keren,et al. Contextual RNN-T For Open Domain ASR , 2020, INTERSPEECH.
[17] Hairong Liu,et al. Exploring neural transducers for end-to-end speech recognition , 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[18] Tomoki Toda,et al. Back-Translation-Style Data Augmentation for end-to-end ASR , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[19] Jay Mahadeokar,et al. Improved Neural Language Model Fusion for Streaming Recurrent Neural Network Transducer , 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Tara N. Sainath,et al. A Comparison of Sequence-to-Sequence Models for Speech Recognition , 2017, INTERSPEECH.
[21] Weidong Li,et al. Knowledge graph based natural language generation with adapted pointer-generator networks , 2020, Neurocomputing.
[22] Geoffrey E. Hinton,et al. Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[23] Cyril Allauzen,et al. Hybrid Autoregressive Transducer (HAT) , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[24] Ariya Rastrow,et al. Contextual Language Model Adaptation for Conversational Agents , 2018, INTERSPEECH.
[25] Hermann Ney,et al. Improved training of end-to-end attention models for speech recognition , 2018, INTERSPEECH.
[26] Jürgen Schmidhuber,et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks , 2006, ICML.
[27] Sanjeev Khudanpur,et al. End-to-end Speech Recognition Using Lattice-free MMI , 2018, INTERSPEECH.
[28] Tara N. Sainath,et al. Shallow-Fusion End-to-End Contextual Biasing , 2019, INTERSPEECH.
[29] Yoshua Bengio,et al. Attention-Based Models for Speech Recognition , 2015, NIPS.
[30] Gokhan Tur,et al. Joint Contextual Modeling for ASR Correction and Language Understanding , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[31] Xiaofeng Liu,et al. RNN-Transducer with Stateless Prediction Network , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[32] John Paul Shen,et al. Audio-visual TED corpus: enhancing the TED-LIUM corpus with facial information, contextual text and object recognition , 2019, UbiComp/ISWC Adjunct.
[33] Shinji Watanabe,et al. ESPnet: End-to-End Speech Processing Toolkit , 2018, INTERSPEECH.
[34] Tara N. Sainath,et al. Contextual Speech Recognition in End-to-end Neural Network Systems Using Beam Search , 2018, INTERSPEECH.
[35] Shinji Watanabe,et al. End-to-end Speech Recognition With Word-Based RNN Language Models , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[36] Hermann Ney,et al. CTC in the Context of Generalized Full-Sum HMM Training , 2017, INTERSPEECH.
[37] Naoyuki Kanda,et al. Internal Language Model Estimation for Domain-Adaptive End-to-End Speech Recognition , 2020, 2021 IEEE Spoken Language Technology Workshop (SLT).
[38] Tsuyoshi Usagawa,et al. Contextual keyword spotting in lecture video with deep convolutional neural network , 2017, 2017 International Conference on Advanced Computer Science and Information Systems (ICACSIS).
[39] Naoyuki Kanda,et al. Internal Language Model Training for Domain-Adaptive End-To-End Speech Recognition , 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[40] Ying Zhang,et al. Towards End-to-End Speech Recognition with Deep Convolutional Neural Networks , 2016, INTERSPEECH.
[41] Gil Keren,et al. Deep Shallow Fusion for RNN-T Personalization , 2020, 2021 IEEE Spoken Language Technology Workshop (SLT).
[42] Nancy F. Chen,et al. Topic-Aware Pointer-Generator Networks for Summarizing Spoken Conversations , 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[43] Gunnar Evermann,et al. Class LM and word mapping for contextual biasing in End-to-End ASR , 2020, INTERSPEECH.
[44] Qi Liu,et al. Modular End-to-End Automatic Speech Recognition Framework for Acoustic-to-Word Model , 2020, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[45] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[46] Tara N. Sainath,et al. State-of-the-Art Speech Recognition with Sequence-to-Sequence Models , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[47] Yongqiang Wang,et al. End-to-end Contextual Speech Recognition Using Class Language Models and a Token Passing Decoder , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[48] Tara N. Sainath,et al. An Analysis of Incorporating an External Language Model into a Sequence-to-Sequence Model , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[49] Gil Keren,et al. Contextualized Streaming End-to-End Speech Recognition with Trie-Based Deep Biasing and Shallow Fusion , 2021, Interspeech 2021.
[50] Quoc V. Le,et al. SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition , 2019, INTERSPEECH.
[51] Christopher D. Manning,et al. Get To The Point: Summarization with Pointer-Generator Networks , 2017, ACL.