暂无分享,去创建一个
[1] Francis M. Tyers,et al. Common Voice: A Massively-Multilingual Speech Corpus , 2020, LREC.
[2] Cordelia Schmid,et al. Product Quantization for Nearest Neighbor Search , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[3] James R. Glass,et al. Speech2Vec: A Sequence-to-Sequence Framework for Learning Word Embeddings from Speech , 2018, INTERSPEECH.
[4] Alexei Baevski,et al. Unsupervised Speech Recognition , 2021, ArXiv.
[5] Gabriel Synnaeve,et al. Massively Multilingual ASR: 50 Languages, 1 Model, 1 Billion Parameters , 2020, INTERSPEECH.
[6] Jürgen Schmidhuber,et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks , 2006, ICML.
[7] Oriol Vinyals,et al. Representation Learning with Contrastive Predictive Coding , 2018, ArXiv.
[8] Ruslan Salakhutdinov,et al. HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units , 2021, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[9] Mark J. F. Gales,et al. Speech recognition and keyword spotting for low-resource languages: Babel project research at CUED , 2014, SLTU.
[10] Mark Hasegawa-Johnson,et al. Grapheme-to-Phoneme Transduction for Cross-Language ASR , 2020, SLSP.
[11] Alan W Black,et al. Towards Zero-shot Learning for Automatic Phonemic Transcription , 2020, AAAI.
[12] Alexei Baevski,et al. vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations , 2019, ICLR.
[13] Alexei Baevski,et al. wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations , 2020, NeurIPS.
[14] Chris Dyer,et al. PanPhon: A Resource for Mapping IPA Segments to Articulatory Feature Vectors , 2016, COLING.
[15] Myle Ott,et al. fairseq: A Fast, Extensible Toolkit for Sequence Modeling , 2019, NAACL.
[16] Gabriel Synnaeve,et al. MLS: A Large-Scale Multilingual Dataset for Speech Research , 2020, INTERSPEECH.
[17] Ben Poole,et al. Categorical Reparameterization with Gumbel-Softmax , 2016, ICLR.
[18] Quoc V. Le,et al. Improved Noisy Student Training for Automatic Speech Recognition , 2020, INTERSPEECH.
[19] M. Hasegawa-Johnson,et al. Zero-Shot Cross-Lingual Phonetic Recognition with External Language Embedding , 2021, Interspeech.
[20] Gabriel Synnaeve,et al. Wav2Letter++: A Fast Open-source Speech Recognition System , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] Gabriel Synnaeve,et al. Self-Training and Pre-Training are Complementary for Speech Recognition , 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[22] Herman Kamper,et al. Multilingual transfer of acoustic word embeddings improves when training on languages related to the target zero-resource language , 2021, Interspeech 2021.
[23] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[24] Florian Metze,et al. Sequence-Based Multi-Lingual Low Resource Speech Recognition , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[25] Gabriel Synnaeve,et al. slimIPL: Language-Model-Free Iterative Pseudo-Labeling , 2020, Interspeech.
[26] Hao Tang,et al. An Unsupervised Autoregressive Model for Speech Representation Learning , 2019, INTERSPEECH.
[27] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[28] Alan W Black,et al. Universal Phone Recognition with a Multilingual Allophone System , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[29] Gabriel Synnaeve,et al. Iterative Pseudo-Labeling for Speech Recognition , 2020, INTERSPEECH.
[30] Edouard Grave,et al. End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures , 2019, ArXiv.
[31] Florian Metze,et al. Differentiable Allophone Graphs for Language-Universal Speech Recognition , 2021, Interspeech.
[32] Lin-Shan Lee,et al. Completely Unsupervised Phoneme Recognition by Adversarially Learning Mapping Relationships from Audio Embeddings , 2018, INTERSPEECH.
[33] Tanja Schultz,et al. Globalphone: a multilingual speech and text database developed at karlsruhe university , 2002, INTERSPEECH.
[34] Guangsen Wang,et al. Adapt-and-Adjust: Overcoming the Long-Tail Problem of Multilingual Speech Recognition , 2020, Interspeech.
[35] Ronan Collobert,et al. Unsupervised Cross-lingual Representation Learning for Speech Recognition , 2020, Interspeech.
[36] Kuan-Yu Chen,et al. Completely Unsupervised Speech Recognition By A Generative Adversarial Network Harmonized With Iteratively Refined Hidden Markov Models , 2019 .