Animating Face using Disentangled Audio Representations