Hema Swetha Koppula | Ashish Shrivastava | Oncel Tuzel | Jen-Hao Rick Chang | Xiaoshuai Zhang
[1] Thomas Hofmann, et al. Controlling Style and Semantics in Weakly-Supervised Image Generation, 2019, ECCV.
[2] Stefano Ermon, et al. Improved Autoregressive Modeling with Distribution Smoothing, 2021, ICLR.
[3] Yoshua Bengio, et al. A Recurrent Latent Variable Model for Sequential Data, 2015, NIPS.
[4] Nicholas W. D. Evans, et al. Spoofing countermeasures to protect automatic speaker verification from voice conversion, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[5] Tao Qin, et al. FastSpeech 2: Fast and High-Quality End-to-End Text to Speech, 2021, ICLR.
[6] Tero Karras, et al. Analyzing and Improving the Image Quality of StyleGAN, 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Navdeep Jaitly, et al. Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Wenbin Cai, et al. Separating Style and Content for Generalized Style Transfer, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[9] Yuxuan Wang, et al. Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis, 2018, ICML.
[10] Lukasz Kaiser, et al. Attention Is All You Need, 2017, NIPS.
[11] Ryan Prenger, et al. WaveGlow: A Flow-based Generative Network for Speech Synthesis, 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Jeff Donahue, et al. Large Scale GAN Training for High Fidelity Natural Image Synthesis, 2018, ICLR.
[13] Sungwon Kim, et al. Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search, 2020, NeurIPS.
[14] Kou Tanaka, et al. StarGAN-VC2: Rethinking Conditional Methods for StarGAN-Based Voice Conversion, 2019, INTERSPEECH.
[15] Sanjeev Khudanpur, et al. X-Vectors: Robust DNN Embeddings for Speaker Recognition, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] Junichi Yamagishi, et al. Voice Conversion Challenge 2020: Intra-lingual semi-parallel and cross-lingual voice conversion, 2020, Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020.
[17] M. Villegas, et al. GANwriting: Content-Conditioned Generation of Styled Handwritten Word Images, 2020, ECCV.
[18] Tomi Kinnunen, et al. Spoofing and countermeasures for automatic speaker verification, 2013, INTERSPEECH.
[19] Bjorn Ommer, et al. A Disentangling Invertible Interpretation Network for Explaining Latent Representations, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[20] Tao Qin, et al. AdaSpeech: Adaptive Text to Speech for Custom Voice, 2021, ICLR.
[21] Ganesh Sivaraman, et al. Generalization of Audio Deepfake Detection, 2020, Odyssey.
[22] Mark Hasegawa-Johnson, et al. Zero-Shot Voice Style Transfer with Only Autoencoder Loss, 2019, ICML.
[23] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[24] Yong Jae Lee, et al. FineGAN: Unsupervised Hierarchical Disentanglement for Fine-Grained Object Generation and Discovery, 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[25] Joon Son Chung, et al. VoxCeleb2: Deep Speaker Recognition, 2018, INTERSPEECH.
[26] Richard Zhang, et al. Making Convolutional Networks Shift-Invariant Again, 2019, ICML.
[27] Stefanie Tellex, et al. Generating Handwriting via Decoupled Style Descriptors, 2020, ECCV.
[28] M. Hasegawa-Johnson, et al. Unsupervised Speech Decomposition via Triple Information Bottleneck, 2020, ICML.
[29] Guillaume Lample, et al. Fader Networks: Manipulating Images by Sliding Attributes, 2017, NIPS.
[30] Ryan Prenger, et al. Mellotron: Multispeaker Expressive Voice Synthesis by Conditioning on Rhythm, Pitch and Global Style Tokens, 2019, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[31] Hyrum S. Anderson, et al. The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation, 2018, ArXiv.
[32] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[33] Raja Bala, et al. Editing in Style: Uncovering the Local Semantics of GANs, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[34] Kou Tanaka, et al. CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion, 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[35] Douglas A. Reynolds, et al. Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification: Fundamentals, 2020, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[36] K. Simonyan, et al. End-to-End Adversarial Text-to-Speech, 2020, ICLR.
[37] Siwei Lyu, et al. Deepfake Detection: Current Challenges and Next Steps, 2020, 2020 IEEE International Conference on Multimedia & Expo Workshops (ICMEW).
[38] Bjorn Ommer, et al. Unsupervised Robust Disentangling of Latent Characteristics for Image Synthesis, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[39] Quoc V. Le, et al. Searching for Activation Functions, 2018, arXiv.
[40] Stefanos Zafeiriou, et al. ArcFace: Additive Angular Margin Loss for Deep Face Recognition, 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[41] Bolei Zhou, et al. Interpreting the Latent Space of GANs for Semantic Face Editing, 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[42] Harshad Rai, et al. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, 2018.
[43] Junichi Yamagishi, et al. CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit (version 0.92), 2019.
[44] Sercan Ömer Arik, et al. Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning, 2017, ICLR.
[45] Otmar Hilliges, et al. DeepWriting: Making Digital Ink Editable via Deep Generative Modeling, 2018, CHI.
[46] Brian L. Price, et al. Text and Style Conditioned GAN for the Generation of Offline-Handwriting Lines, 2020, BMVC.
[47] Aleksandr Sizov, et al. ASVspoof: The Automatic Speaker Verification Spoofing and Countermeasures Challenge, 2017, IEEE Journal of Selected Topics in Signal Processing.
[48] Leon A. Gatys, et al. A Neural Algorithm of Artistic Style, 2015, ArXiv.