Gabriel Synnaeve | Ronan Collobert | Alexei Baevski | Michael Auli | Vineel Pratap | Wei-Ning Hsu | Jacob Kahn | Ann Lee | Tatiana Likhomanenko | Qiantong Xu | Anuroop Sriram
[1] Tara N. Sainath, et al. Generation of Large-Scale Simulated Utterances in Virtual Rooms to Train Deep-Neural Networks for Far-Field Speech Recognition in Google Home, 2017, INTERSPEECH.
[2] Cordelia Schmid, et al. Product Quantization for Nearest Neighbor Search, 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[3] James Glass, et al. Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech, 2020, ICLR.
[4] Richard Socher, et al. An Investigation of Phone-Based Subword Units for End-to-End Speech Recognition, 2020, INTERSPEECH.
[5] Edouard Grave, et al. Reducing Transformer Depth on Demand with Structured Dropout, 2019, ICLR.
[6] Ronan Collobert, et al. wav2vec: Unsupervised Pre-training for Speech Recognition, 2019, INTERSPEECH.
[7] James R. Glass, et al. Extracting Domain Invariant Features by Unsupervised Learning for Robust Automatic Speech Recognition, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Yoshua Bengio, et al. Learning Problem-agnostic Speech Representations from Multiple Self-supervised Tasks, 2019, INTERSPEECH.
[9] Yu Zhang, et al. Conformer: Convolution-augmented Transformer for Speech Recognition, 2020, INTERSPEECH.
[10] Jonathan Le Roux, et al. Unsupervised Domain Adaptation for Speech Recognition via Uncertainty Driven Self-Training, 2020, ArXiv.
[11] Sanjeev Khudanpur, et al. Librispeech: An ASR corpus based on public domain audio books, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Ronan Collobert, et al. Unsupervised Cross-lingual Representation Learning for Speech Recognition, 2020, INTERSPEECH.
[13] Hao Tang, et al. An Unsupervised Autoregressive Model for Speech Representation Learning, 2019, INTERSPEECH.
[14] Francis M. Tyers, et al. Common Voice: A Massively-Multilingual Speech Corpus, 2020, LREC.
[15] Alexei Baevski, et al. Effectiveness of self-supervised pre-training for speech recognition, 2019, ArXiv.
[16] Yongqiang Wang, et al. An investigation of deep neural networks for noise robust speech recognition, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[17] Hermann Ney, et al. The RWTH ASR System for TED-LIUM Release 2: Improving Hybrid HMM with SpecAugment, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Alexei Baevski, et al. vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations, 2019, ICLR.
[19] Yannick Estève, et al. TED-LIUM 3: twice as much data and corpus repartition for experiments on speaker adaptation, 2018, SPECOM.
[20] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[21] Gabriel Synnaeve, et al. Wav2Letter++: A Fast Open-source Speech Recognition System, 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[22] Gabriel Synnaeve, et al. Rethinking Evaluation in ASR: Are Our Models Robust Enough?, 2020, INTERSPEECH.
[23] Yu Zhang, et al. Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data, 2017, NIPS.
[24] Armand Joulin, et al. Libri-Light: A Benchmark for ASR with Limited or No Supervision, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[25] Richard M. Stern, et al. Features Based on Auditory Physiology and Perception, 2012, Techniques for Noise Robustness in Automatic Speech Recognition.
[26] Ben Poole, et al. Categorical Reparameterization with Gumbel-Softmax, 2016, ICLR.
[27] Kenneth Heafield, et al. KenLM: Faster and Smaller Language Model Queries, 2011, WMT@EMNLP.
[28] Jürgen Schmidhuber, et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks, 2006, ICML.
[29] Hao Tang, et al. A Study of Enhancement, Augmentation, and Autoencoder Methods for Domain Adaptation in Distant Speech Recognition, 2018, INTERSPEECH.
[30] Emmanuel Dupoux, et al. VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation, 2021, ACL.
[31] Armand Joulin, et al. Unsupervised Pretraining Transfers Well Across Languages, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[32] Alexei Baevski, et al. wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations, 2020, NeurIPS.
[33] Steven Greenberg, et al. Robust speech recognition using the modulation spectrogram, 1998, Speech Commun.
[34] Yanning Zhang, et al. An unsupervised deep domain adaptation approach for robust speech recognition, 2017, Neurocomputing.