[1] Shinnosuke Takamichi, et al. Training algorithm to deceive Anti-Spoofing Verification for DNN-based speech synthesis, 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Quan Wang, et al. Generalized End-to-End Loss for Speaker Verification, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Heiga Zen, et al. WaveNet: A Generative Model for Raw Audio, 2016, SSW.
[4] Xiao Liu, et al. Deep Speaker: an End-to-End Neural Speaker Embedding System, 2017, ArXiv.
[5] Jürgen Schmidhuber, et al. Highway Networks, 2015, ArXiv.
[6] Junichi Yamagishi, et al. CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit, 2017.
[7] Rafael Valle, et al. Attacking Speaker Recognition With Deep Generative Models, 2018, ArXiv.
[8] Alan W. Black, et al. Unit selection in a concatenative speech synthesis system using a large speech database, 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[9] Tomi Kinnunen, et al. ASVspoof 2019: Future Horizons in Spoofed and Fake Audio Detection, 2019, INTERSPEECH.
[10] Patrick Kenny, et al. Front-End Factor Analysis for Speaker Verification, 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[11] Shinnosuke Takamichi, et al. Statistical Parametric Speech Synthesis Incorporating Generative Adversarial Networks, 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[12] Aaron C. Courville, et al. Improved Training of Wasserstein GANs, 2017, NIPS.
[13] Douglas A. Reynolds, et al. Speaker Verification Using Adapted Gaussian Mixture Models, 2000, Digit. Signal Process.
[14] Junichi Yamagishi, et al. SUPERSEDED - CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit, 2016.
[15] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[16] Keiichi Tokuda, et al. Speech parameter generation algorithms for HMM-based speech synthesis, 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[17] Alex Hirschfield, et al. Toward a dynamic framework for security evaluation of voice verification systems, 2009, 2009 IEEE Toronto International Conference Science and Technology for Humanity (TIC-STH).
[18] Hideyuki Tachibana, et al. Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Donald G. Childers, et al. Formant speech synthesis: improving production quality, 1989, IEEE Trans. Acoust. Speech Signal Process.
[20] Jae S. Lim, et al. Signal estimation from modified short-time Fourier transform, 1983, ICASSP.
[21] Kong-Aik Lee, et al. Introduction to Voice Presentation Attack Detection and Recent Advances, 2019, Handbook of Biometric Anti-Spoofing, 2nd Ed.
[22] Samy Bengio, et al. Tacotron: Towards End-to-End Speech Synthesis, 2017, INTERSPEECH.
[23] Sanjeev Khudanpur, et al. X-Vectors: Robust DNN Embeddings for Speaker Recognition, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[24] Geoffrey E. Hinton, et al. Layer Normalization, 2016, ArXiv.