Laughing Matters: Introducing Laughing-Face Generation using Diffusion Models
Antoni Bigata Casademunt, Rodrigo Mira, Nikita Drobyshev, Konstantinos Vougioukas, Stavros Petridis, Maja Pantic
[1] Quoc V. Le, et al. Symbolic Discovery of Optimization Algorithms, 2023, NeurIPS.
[2] Quoc V. Le, et al. Noise2Music: Text-conditioned Music Generation with Diffusion Models, 2023, arXiv.
[3] B. Schölkopf, et al. Moûsai: Text-to-Music Generation with Long-Context Latent Diffusion, 2023, arXiv.
[4] M. Pantic, et al. Diffused Heads: Diffusion Models Beat GANs on Talking-Face Generation, 2023, arXiv.
[5] Radu Tudor Ionescu, et al. Diffusion Models in Vision: A Survey, 2022, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[6] Jiwen Lu, et al. DiffTalk: Crafting Diffusion Models for Generalized Talking Head Synthesis, 2023, arXiv.
[7] Daniel C. Tompkins, et al. BEATs: Audio Pre-Training with Acoustic Tokenizers, 2022, ICML.
[8] Ming-Yu Liu, et al. SPACE: Speech-driven Portrait Animation with Controllable Expression, 2023, ICCV.
[9] David J. Fleet, et al. Imagen Video: High Definition Video Generation with Diffusion Models, 2022, arXiv.
[10] Yaniv Taigman, et al. Make-A-Video: Text-to-Video Generation without Text-Video Data, 2022, ICLR.
[11] Jonathan Ho. Classifier-Free Diffusion Guidance, 2022, arXiv.
[12] V. Lempitsky, et al. MegaPortraits: One-shot Megapixel Neural Head Avatars, 2022, ACM Multimedia.
[13] Xiaoguang Han, et al. Expressive Talking Head Generation with Granular Audio-Visual Control, 2022, CVPR.
[14] Tero Karras, et al. Elucidating the Design Space of Diffusion-Based Generative Models, 2022, NeurIPS.
[15] Wayne Wu, et al. EAMM: One-Shot Emotional Talking Face via Audio-Based Emotion-Aware Motion Model, 2022, SIGGRAPH.
[16] David J. Fleet, et al. Video Diffusion Models, 2022, NeurIPS.
[17] Juan F. Montesinos, et al. VocaLiST: An Audio-Visual Synchronisation Model for Lips and Voices, 2022, INTERSPEECH.
[18] Karsten Kreis, et al. Tackling the Generative Learning Trilemma with Denoising Diffusion GANs, 2021, ICLR.
[19] J. Malik, et al. MViTv2: Improved Multiscale Vision Transformers for Classification and Detection, 2022, CVPR.
[20] Jinyu Li, et al. WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing, 2021, IEEE Journal of Selected Topics in Signal Processing.
[21] Andrea Vedaldi, et al. Audio-Visual Synchronisation in the Wild, 2021, BMVC.
[22] Prafulla Dhariwal, et al. Diffusion Models Beat GANs on Image Synthesis, 2021, NeurIPS.
[23] Chen Change Loy, et al. Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation, 2021, CVPR.
[24] Shitong Luo, et al. Diffusion Probabilistic Models for 3D Point Cloud Generation, 2021, CVPR.
[25] Abhishek Kumar, et al. Score-Based Generative Modeling through Stochastic Differential Equations, 2020, ICLR.
[26] C. V. Jawahar, et al. A Lip Sync Expert Is All You Need for Speech to Lip Generation in the Wild, 2020, ACM Multimedia.
[27] Thierry Dutoit, et al. Laughter Synthesis: Combining Seq2seq Modeling with Transfer Learning, 2020, INTERSPEECH.
[28] Pieter Abbeel, et al. Denoising Diffusion Probabilistic Models, 2020, NeurIPS.
[29] Tero Karras, et al. Training Generative Adversarial Networks with Limited Data, 2020, NeurIPS.
[30] Brojeshwar Bhowmick, et al. Identity-Preserving Realistic Talking Face Generation, 2020, IJCNN.
[31] Yang Zhou, et al. MakeItTalk: Speaker-Aware Talking-Head Animation, 2020, ACM Transactions on Graphics.
[32] Yang Song, et al. Generative Modeling by Estimating Gradients of the Data Distribution, 2019, NeurIPS.
[33] Maja Pantic, et al. Realistic Speech-Driven Facial Animation with GANs, 2019, International Journal of Computer Vision.
[34] Frank Hutter, et al. Decoupled Weight Decay Regularization, 2017, ICLR.
[35] Maja Pantic, et al. End-to-End Speech-Driven Facial Animation with Temporal GANs, 2018, BMVC.
[36] Jaakko Lehtinen, et al. Audio-Driven Facial Animation by Joint End-to-End Learning of Pose and Emotion, 2017, ACM Transactions on Graphics.
[37] Fabio Viola, et al. The Kinetics Human Action Video Dataset, 2017, arXiv.
[38] Joon Son Chung, et al. You Said That?, 2017, BMVC.
[39] Aren Jansen, et al. Audio Set: An Ontology and Human-Labeled Dataset for Audio Events, 2017, ICASSP.
[40] Pascale Fung, et al. A Long Short-Term Memory Framework for Predicting Humor in Dialogues, 2016, NAACL.
[41] William Curran, et al. Laughter Research: A Review of the ILHAIRE Project, 2016, Toward Robotic Socially Believable Behaving Systems.
[42] Lei Xie, et al. Photo-Real Talking Head with Deep Bidirectional LSTM, 2015, ICASSP.
[43] Surya Ganguli, et al. Deep Unsupervised Learning Using Nonequilibrium Thermodynamics, 2015, ICML.
[44] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[45] Catherine Pelachaud, et al. Laughter Animation Synthesis, 2014, AAMAS.
[46] Thierry Dutoit, et al. Automatic Phonetic Transcription of Laughter and Its Application to Laughter Synthesis, 2013, Humaine Association Conference on Affective Computing and Intelligent Interaction.
[47] Maja Pantic, et al. The MAHNOB Laughter Database, 2013, Image and Vision Computing.
[48] Thierry Dutoit, et al. The AVLaughterCycle Database, 2010, LREC.
[49] Björn Schuller, et al. Being Bored? Recognising Natural Interest by Extensive Audiovisual Integration for Real-Life Application, 2009, Image and Vision Computing.
[50] Alex Pentland, et al. Honest Signals: How They Shape Our World, 2008.
[51] Dirk Heylen, et al. The Sensitive Artificial Listener: An Induction Technique for Generating Emotionally Coloured Conversation, 2008.
[52] Lei Xie, et al. A Coupled HMM Approach to Video-Realistic Speech Animation, 2007, Pattern Recognition.
[53] J. Trouvain, et al. Imitating Conversational Laughter with an Articulatory Speech Synthesizer, 2007.
[54] Phillip J. Glenn. Laughter in Interaction, 2003.
[55] P. Ekman, et al. The Expressive Pattern of Laughter, 2001.
[56] Joshua Foer, et al. Laughter: A Scientific Investigation, 2001, The Yale Journal of Biology and Medicine.
[57] Hani Yehia, et al. Quantitative Association of Vocal-Tract and Facial Behavior, 1998, Speech Communication.
[58] Satoshi Nakamura, et al. Lip Movement Synthesis from Speech Based on Hidden Markov Models, 1998, IEEE International Conference on Automatic Face and Gesture Recognition.