End-To-End Generation of Talking Faces from Noisy Speech
Chenliang Xu | Ross K. Maddox | Sefik Emre Eskimez | Zhiyao Duan
[1] J. Gower. Generalized Procrustes analysis, 1975.
[2] Eero P. Simoncelli, et al. Image quality assessment: from error visibility to structural similarity, 2004, IEEE Transactions on Image Processing.
[3] R. Freyman, et al. The role of visual speech cues in reducing energetic and informational masking, 2005, The Journal of the Acoustical Society of America.
[4] Jon Barker, et al. An audio-visual corpus for speech perception and automatic speech recognition, 2006, The Journal of the Acoustical Society of America.
[5] Davis E. King, et al. Dlib-ml: A Machine Learning Toolkit, 2009, J. Mach. Learn. Res.
[6] Joshua G. W. Bernstein, et al. Auditory and auditory-visual intelligibility of speech in fluctuating maskers for normal-hearing and hearing-impaired listeners, 2009, The Journal of the Acoustical Society of America.
[7] Yoshua Bengio, et al. Generative Adversarial Nets, 2014, NIPS.
[8] Thomas Brox, et al. U-Net: Convolutional Networks for Biomedical Image Segmentation, 2015, MICCAI.
[9] Adrian K. C. Lee, et al. Auditory selective attention is enhanced by a task-irrelevant temporally coherent visual stimulus in human listeners, 2015, eLife.
[10] Joon Son Chung, et al. Lip Reading in the Wild, 2016, ACCV.
[11] Raymond Y. K. Lau, et al. Least Squares Generative Adversarial Networks, 2016, 2017 IEEE International Conference on Computer Vision (ICCV).
[12] Ira Kemelmacher-Shlizerman, et al. Synthesizing Obama, 2017, ACM Trans. Graph.
[13] Joon Son Chung, et al. You said that?, 2017, BMVC.
[14] Chenliang Xu, et al. Lip Movements Generation at a Glance, 2018, ECCV.
[15] Maja Pantic, et al. End-to-End Speech-Driven Facial Animation with Temporal GANs, 2018, BMVC.
[16] Hang Zhou, et al. Talking Face Generation by Adversarially Disentangled Audio-Visual Representation, 2018, AAAI.
[17] Jingwen Zhu, et al. Talking Face Generation by Conditional Recurrent Adversarial Network, 2018, IJCAI.
[18] Chenliang Xu, et al. Hierarchical Cross-Modal Talking Face Generation With Dynamic Pixel-Wise Loss, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Maja Pantic, et al. Realistic Speech-Driven Facial Animation with GANs, 2019, International Journal of Computer Vision.
[20] Chenliang Xu, et al. Noise-Resilient Training Method for Face Landmark Generation From Speech, 2020, IEEE/ACM Transactions on Audio, Speech, and Language Processing.