Speech Recognition with Augmented Synthesized Speech
Bhuvana Ramabhadran | Ye Jia | Yu Zhang | Zelin Wu | Pedro Moreno | Andrew Rosenberg | Yonghui Wu
[1] Yu Zhang, et al. Latent Sequence Decompositions, 2016, ICLR.
[2] Xiaodong Cui, et al. Data Augmentation for Deep Neural Network Acoustic Modeling, 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[3] Fadi Biadsy, et al. Effectively Building Tera Scale MaxEnt Language Models Incorporating Non-Linguistic Signals, 2017, INTERSPEECH.
[4] Satoshi Nakamura, et al. Listening while speaking: Speech chain by deep learning, 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[5] Tara N. Sainath, et al. Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling, 2019, ArXiv.
[6] Patrick Nguyen, et al. Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis, 2018, NeurIPS.
[7] Heiga Zen, et al. Hierarchical Generative Modeling for Controllable Speech Synthesis, 2018, ICLR.
[8] Satoshi Nakamura, et al. Machine Speech Chain with One-shot Speaker Adaptation, 2018, INTERSPEECH.
[9] Erich Elsen, et al. Efficient Neural Audio Synthesis, 2018, ICML.
[10] Sanjeev Khudanpur, et al. Librispeech: An ASR corpus based on public domain audio books, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[11] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[12] Boris Ginsburg, et al. Training Neural Speech Recognition Systems with Synthetic Speech Augmentation, 2018, ArXiv.
[13] Yuxuan Wang, et al. Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis, 2018, ICML.
[14] Alex Graves, et al. Generating Sequences With Recurrent Neural Networks, 2013, ArXiv.
[15] Erica Lindsay Cooper, et al. Text-to-Speech Synthesis Using Found Data for Low-Resource Languages, 2019.
[16] Tara N. Sainath, et al. State-of-the-Art Speech Recognition with Sequence-to-Sequence Models, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] P. Denes, et al. The speech chain: the physics and biology of spoken language, 1963.
[18] Navdeep Jaitly, et al. Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] P. Denes. The Speech Chain, 1963.
[20] Kaiming He, et al. Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour, 2017, ArXiv.