Realistic Speech-Driven Facial Animation with GANs
Konstantinos Vougioukas | Stavros Petridis | Maja Pantic
[1] Yisong Yue,et al. A deep learning approach for generalized speech animation , 2017, ACM Trans. Graph.
[2] Hai Xuan Pham,et al. Speech-Driven 3D Facial Animation with Implicit Emotional Awareness: A Deep Learning Approach , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[3] Léon Bottou,et al. Towards Principled Methods for Training Generative Adversarial Networks , 2017, ICLR.
[4] Chenliang Xu,et al. Deep Cross-Modal Audio-Visual Generation , 2017, ACM Multimedia.
[5] Xiaoming Liu,et al. Face Alignment in Full Pose Range: A 3D Total Solution , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[6] Christoph Bregler,et al. Video Rewrite: Driving Visual Speech with Audio , 1997, SIGGRAPH.
[7] Yitong Li,et al. Video Generation From Text , 2017, AAAI.
[8] Siwei Lyu,et al. In Ictu Oculi: Exposing AI Generated Fake Face Videos by Detecting Eye Blinking , 2018, ArXiv.
[9] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[10] Jon Barker,et al. An audio-visual corpus for speech perception and automatic speech recognition , 2006, The Journal of the Acoustical Society of America.
[11] Lei Xie,et al. A coupled HMM approach to video-realistic speech animation , 2007, Pattern Recognit.
[12] Mahadev Satyanarayanan,et al. OpenFace: A general-purpose face recognition library with mobile applications , 2016.
[13] A. Bentivoglio,et al. Analysis of blink rate patterns in normal subjects , 1997, Movement disorders : official journal of the Movement Disorder Society.
[14] Chenliang Xu,et al. Hierarchical Cross-Modal Talking Face Generation With Dynamic Pixel-Wise Loss , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Yoshua Bengio,et al. Generative Adversarial Networks , 2014, ArXiv.
[16] Yann LeCun,et al. Deep multi-scale video prediction beyond mean square error , 2015, ICLR.
[17] Hai Xuan Pham,et al. Generative Adversarial Talking Head: Bringing Portraits to Life with a Weakly Supervised Neural Network , 2018, ArXiv.
[18] Naomi Harte,et al. TCD-TIMIT: An Audio-Visual Corpus of Continuous Speech , 2015, IEEE Transactions on Multimedia.
[19] Takaaki Kuratate,et al. Linking facial animation, head motion and speech acoustics , 2002, J. Phonetics.
[20] Hang Zhou,et al. Talking Face Generation by Adversarially Disentangled Audio-Visual Representation , 2018, AAAI.
[21] Shimon Whiteson,et al. LipNet: End-to-End Sentence-level Lipreading , 2016, ArXiv.
[22] Ragini Verma,et al. CREMA-D: Crowd-Sourced Emotional Multimodal Actors Dataset , 2014, IEEE Transactions on Affective Computing.
[23] Joon Son Chung,et al. Lip Reading in the Wild , 2016, ACCV.
[24] Siwei Lyu,et al. In Ictu Oculi: Exposing AI Created Fake Videos by Detecting Eye Blinking , 2018, 2018 IEEE International Workshop on Information Forensics and Security (WIFS).
[25] Joon Son Chung,et al. You said that? , 2017, BMVC.
[26] Chenliang Xu,et al. Lip Movements Generation at a Glance , 2018, ECCV.
[27] Soumith Chintala,et al. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks , 2015, ICLR.
[28] Satoshi Nakamura,et al. Lip movement synthesis from speech based on hidden Markov models , 1998, Proceedings Third IEEE International Conference on Automatic Face and Gesture Recognition.
[29] Jaakko Lehtinen,et al. Audio-driven facial animation by joint end-to-end learning of pose and emotion , 2017, ACM Trans. Graph.
[30] Maja Pantic,et al. End-to-End Speech-Driven Facial Animation with Temporal GANs , 2018, BMVC.
[31] Subhransu Maji,et al. VisemeNet: Audio-Driven Animator-Centric Speech Animation , 2018, ACM Trans. Graph.
[32] Jan Kautz,et al. MoCoGAN: Decomposing Motion and Content for Video Generation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[33] Ira Kemelmacher-Shlizerman,et al. Synthesizing Obama , 2017, ACM Trans. Graph.
[34] Shunta Saito,et al. Temporal Generative Adversarial Nets with Singular Value Clipping , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).
[35] Francesc Moreno-Noguer,et al. GANimation: Anatomically-aware Facial Animation from a Single Image , 2018, ECCV.
[36] Lei Xie,et al. Photo-real talking head with deep bidirectional LSTM , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[37] Jan Cech,et al. Real-Time Eye Blink Detection using Facial Landmarks , 2016.
[38] Geoffrey E. Hinton,et al. Visualizing Data using t-SNE , 2008.
[39] Frédéric H. Pighin,et al. Expressive speech-driven facial animation , 2005, ACM Trans. Graph.
[40] Joon Son Chung,et al. Out of Time: Automated Lip Sync in the Wild , 2016, ACCV Workshops.
[41] Lina J. Karam,et al. A no-reference perceptual image sharpness metric based on a cumulative probability of blur detection , 2009, 2009 International Workshop on Quality of Multimedia Experience.
[42] Wei Dai,et al. Very deep convolutional neural networks for raw waveforms , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[43] Thomas Brox,et al. U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.
[44] Hani Yehia,et al. Quantitative association of vocal-tract and facial behavior , 1998, Speech Commun.
[45] Antonio Torralba,et al. Generating Videos with Scene Dynamics , 2016, NIPS.
[46] Vladimir Pavlovic,et al. Speech-Driven 3D Facial Animation with Implicit Emotional Awareness: A Deep Learning Approach , 2017.