Ning Cheng | Shiyu Zhou | Bo Xu | Cheng Yi | Jianzhong Wang
[1] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.
[2] Yan Li,et al. The Speechtransformer for Large-scale Mandarin Chinese Speech Recognition , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Alexei Baevski,et al. vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations , 2019, ICLR.
[4] Linhao Dong,et al. CIF: Continuous Integrate-And-Fire for End-To-End Speech Recognition , 2019, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Ron J. Weiss,et al. Unsupervised Speech Representation Learning Using WaveNet Autoencoders , 2019, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[6] Shuang Xu,et al. Multilingual End-to-End Speech Recognition with A Single Transformer on Low-Resource Languages , 2018, ArXiv.
[7] Gabriel Synnaeve,et al. Wav2Letter++: A Fast Open-source Speech Recognition System , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[9] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011.
[10] Rico Sennrich,et al. Neural Machine Translation of Rare Words with Subword Units , 2015, ACL.
[11] Ronan Collobert,et al. wav2vec: Unsupervised Pre-training for Speech Recognition , 2019, INTERSPEECH.
[12] Alex Graves,et al. Sequence Transduction with Recurrent Neural Networks , 2012, ArXiv.
[13] Pascale Fung,et al. HKUST/MTS: A Very Large Scale Mandarin Telephone Speech Corpus , 2006, ISCSLP.
[14] Tatsuya Kawahara,et al. Transfer Learning of Language-independent End-to-end ASR with Language Model Fusion , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Alexei Baevski,et al. wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations , 2020, NeurIPS.
[16] Matt Shannon,et al. Recurrent Neural Aligner: An Encoder-Decoder Neural Network Model for Sequence to Sequence Mapping , 2017, INTERSPEECH.
[17] Oriol Vinyals,et al. Representation Learning with Contrastive Predictive Coding , 2018, ArXiv.
[18] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[19] Yoshua Bengio,et al. Speech Model Pre-training for End-to-End Spoken Language Understanding , 2019, INTERSPEECH.
[20] Sanjeev Khudanpur,et al. Librispeech: An ASR corpus based on public domain audio books , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[22] Xiangang Li,et al. Improving Transformer-based Speech Recognition Using Unsupervised Pre-training , 2019, ArXiv.
[23] Ronan Collobert,et al. Wav2Letter++: A Fast Open-source Speech Recognition System , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[24] Shuang Xu,et al. Multilingual Recurrent Neural Networks with Residual Learning for Low-Resource Speech Recognition , 2017, INTERSPEECH.
[25] Bo Xu,et al. ECTC-DOCD: An End-to-End Structure with CTC Encoder and OCD Decoder for Speech Recognition , 2019, INTERSPEECH.