Jinsong Zhang | Zhijian Ou | Huahuan Zheng | Wenjie Peng
[1] Sanjeev Khudanpur et al. A Parallelizable Lattice Rescoring Strategy with Neural Language Models, 2021, ICASSP.
[2] Hermann Ney et al. Language Modeling with Deep Transformers, 2019, INTERSPEECH.
[3] Geoffrey E. Hinton et al. Speech Recognition with Deep Recurrent Neural Networks, 2013, ICASSP.
[4] Geoffrey Zweig et al. Transformer-Based Acoustic Modeling for Hybrid Speech Recognition, 2020, ICASSP.
[5] Rohit Prabhavalkar et al. Exploring Architectures, Data and Units for Streaming End-to-End Speech Recognition with RNN-Transducer, 2017, ASRU.
[6] Quoc V. Le et al. SpecAugment on Large Scale Datasets, 2020, ICASSP.
[7] Yoshua Bengio et al. End-to-End Continuous Speech Recognition Using Attention-Based Recurrent NN: First Results, 2014, arXiv.
[8] Brian Kingsbury et al. Advancing RNN Transducer Technology for Speech Recognition, 2021, ICASSP.
[9] Chong Wang et al. Deep Speech 2: End-to-End Speech Recognition in English and Mandarin, 2015, ICML.
[10] Rico Sennrich et al. Neural Machine Translation of Rare Words with Subword Units, 2015, ACL.
[11] Jürgen Schmidhuber et al. Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks, 2006, ICML.
[12] Zhijian Ou et al. Efficient Neural Architecture Search for End-to-End Speech Recognition via Straight-Through Gradients, 2021, SLT.
[13] Shinji Watanabe et al. Recent Developments on ESPnet Toolkit Boosted by Conformer, 2021, ICASSP.
[14] Sanjeev Khudanpur et al. End-to-End Speech Recognition Using Lattice-Free MMI, 2018, INTERSPEECH.
[15] Francis M. Tyers et al. Common Voice: A Massively-Multilingual Speech Corpus, 2020, LREC.
[16] Zhijian Ou et al. CRF-Based Single-Stage Acoustic Modeling with CTC Topology, 2019, ICASSP.
[17] Hermann Ney et al. Context-Dependent Acoustic Modeling Using Graphemes for Large Vocabulary Speech Recognition, 2002, ICASSP.
[18] Hermann Ney et al. Phoneme Based Neural Transducer for Large Vocabulary Speech Recognition, 2021, ICASSP.
[19] Sanjeev Khudanpur et al. Audio Augmentation for Speech Recognition, 2015, INTERSPEECH.
[20] Sanjeev Khudanpur et al. Librispeech: An ASR Corpus Based on Public Domain Audio Books, 2015, ICASSP.
[21] Lukasz Kaiser et al. Attention Is All You Need, 2017, NIPS.
[22] Yu Zhang et al. Conformer: Convolution-Augmented Transformer for Speech Recognition, 2020, INTERSPEECH.
[23] Tanja Schultz et al. Grapheme Based Speech Recognition, 2003, INTERSPEECH.
[24] Zhijian Ou et al. Deformable TDNN with Adaptive Receptive Fields for Speech Recognition, 2021, INTERSPEECH.
[25] Keyu An et al. CAT: A CTC-CRF Based ASR Toolkit Bridging the Hybrid and the End-to-End Approaches Towards Data Efficiency and Low Latency, 2020, INTERSPEECH.
[26] Geoffrey Zweig et al. Faster, Simpler and More Accurate Hybrid ASR Systems Using Wordpieces, 2020, INTERSPEECH.
[27] Yoshua Bengio et al. Attention-Based Models for Speech Recognition, 2015, NIPS.
[28] Alex Graves et al. Sequence Transduction with Recurrent Neural Networks, 2012, arXiv.
[29] Xiaofei Wang et al. A Comparative Study on Transformer vs RNN in Speech Applications, 2019, ASRU.
[30] Keikichi Hirose et al. WFST-Based Grapheme-to-Phoneme Conversion: Open Source Tools for Alignment, Model-Building and Decoding, 2012, FSMNLP.
[31] Steve Young et al. The HTK Book, 1995.
[32] Gerald Penn et al. Convolutional Neural Networks for Speech Recognition, 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[33] Boris Ginsburg et al. Jasper: An End-to-End Convolutional Neural Acoustic Model, 2019, INTERSPEECH.
[34] Qian Zhang et al. Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss, 2020, ICASSP.
[35] George Kurian et al. Google's Neural Machine Translation System: Bridging the Gap Between Human and Machine Translation, 2016, arXiv.
[36] Taku Kudo et al. SentencePiece: A Simple and Language Independent Subword Tokenizer and Detokenizer for Neural Text Processing, 2018, EMNLP.
[37] Quoc V. Le et al. SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition, 2019, INTERSPEECH.
[38] Geoffrey Zweig et al. From Senones to Chenones: Tied Context-Dependent Graphemes for Hybrid Speech Recognition, 2019, ASRU.
[39] Tara N. Sainath et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups, 2012, IEEE Signal Processing Magazine.
[40] Yiming Wang et al. Purely Sequence-Trained Neural Networks for ASR Based on Lattice-Free MMI, 2016, INTERSPEECH.
[41] John J. Godfrey et al. SWITCHBOARD: Telephone Speech Corpus for Research and Development, 1992, ICASSP.
[42] Kyu J. Han et al. Multistream CNN for Robust Acoustic Modeling, 2021, ICASSP.
[43] Sanjeev Khudanpur et al. Flat-Start Single-Stage Discriminatively Trained HMM-Based Models for ASR, 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.