Unsupervised Pre-Training of Bidirectional Speech Encoders via Masked Reconstruction