Style Transfer for Co-Speech Gesture Animation: A Multi-Speaker Conditional-Mixture Approach
Yukiko I. Nakano | Louis-Philippe Morency | Chaitanya Ahuja | Dong Won Lee
[1] Jitendra Malik,et al. Learning Individual Styles of Conversational Gesture , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[2] Michael Neff,et al. Multi-objective adversarial gesture generation , 2019, MIG.
[3] Yaser Sheikh,et al. To React or not to React: End-to-End Visual Pose Forecasting for Personalized Avatar during Dyadic Conversations , 2019, ICMI.
[4] Wojciech Zaremba,et al. Improved Techniques for Training GANs , 2016, NIPS.
[5] 拓海 杉山,et al. Training Report on “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks” , 2017 .
[6] Sergey Ioffe,et al. Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Alexei A. Efros,et al. Toward Multimodal Image-to-Image Translation , 2017, NIPS.
[8] Ira Kemelmacher-Shlizerman,et al. Audio to Body Dynamics , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[9] Li Fei-Fei,et al. Perceptual Losses for Real-Time Style Transfer and Super-Resolution , 2016, ECCV.
[10] Kazuhiko Sumi,et al. Evaluation of Speech-to-Gesture Generation Using Bi-Directional LSTM Network , 2018, IVA.
[11] Bernt Schiele,et al. 2D Human Pose Estimation: New Benchmark and State of the Art Analysis , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[12] Léon Bottou,et al. Wasserstein GAN , 2017, ArXiv.
[13] Thomas Brox,et al. U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.
[14] Yaser Sheikh,et al. OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[15] Christian Obermeier,et al. A speaker's gesture style can affect language comprehension: ERP evidence from gesture-speech integration. , 2015, Social cognitive and affective neuroscience.
[16] Sergey Levine,et al. Real-time prosody-driven synthesis of body language , 2009, ACM Trans. Graph..
[17] Joon Son Chung,et al. Disentangled Speech Embeddings Using Cross-Modal Self-Supervision , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Hans-Peter Seidel,et al. Annotated New Text Engine Animation Animation Lexicon Animation Gesture Profiles MR : . . . JL : . . . Gesture Generation Video Annotated Gesture Script , 2007 .
[19] Douglas A. Reynolds,et al. Gaussian Mixture Models , 2018, Encyclopedia of Biometrics.
[20] Ming-Yu Liu,et al. Coupled Generative Adversarial Networks , 2016, NIPS.
[21] Yu-Ding Lu,et al. DRIT++: Diverse Image-to-Image Translation via Disentangled Representations , 2020, International Journal of Computer Vision.
[22] Inbar Mosseri,et al. XGAN: Unsupervised Image-to-Image Translation for many-to-many Mappings , 2017, Domain Adaptation for Visual Understanding.
[23] Stefan Kopp,et al. Increasing the expressiveness of virtual agents: autonomous generation of speech and gesture for spatial description tasks , 2009, AAMAS.
[24] Robert O. Davis,et al. Sometimes more is better: Agent gestures, procedural knowledge and the foreign language learner , 2019, Br. J. Educ. Technol..
[25] Vighnesh Birodkar,et al. Unsupervised Learning of Disentangled Representations from Video , 2017, NIPS.
[26] Leon A. Gatys,et al. A Neural Algorithm of Artistic Style , 2015, ArXiv.
[27] Stacy Marsella,et al. Gesture generation with low-dimensional embeddings , 2014, AAMAS.
[28] Louis-Philippe Morency,et al. Language2Pose: Natural Language Grounded Pose Forecasting , 2019, 2019 International Conference on 3D Vision (3DV).
[29] Alexei A. Efros,et al. Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[30] Shakir Mohamed,et al. Variational Approaches for Auto-Encoding Generative Adversarial Networks , 2017, ArXiv.
[31] Justine Cassell,et al. BEAT: the Behavior Expression Animation Toolkit , 2001, Life-like characters.
[32] S. P. Lloyd,et al. Least squares quantization in PCM , 1982, IEEE Trans. Inf. Theory.
[33] A. Murat Tekalp,et al. Analysis of Head Gesture and Prosody Patterns for Prosody-Driven Head-Gesture Animation , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[34] Alexei A. Efros,et al. Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[35] Yongguo Kang,et al. Multi-reference Tacotron by Intercross Training for Style Disentangling, Transfer and Control in Speech Synthesis , 2019, ArXiv.
[36] Yaser Sheikh,et al. Hand Keypoint Detection in Single Images Using Multiview Bootstrapping , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[37] Yingyu Liang,et al. Generalization and Equilibrium in Generative Adversarial Nets (GANs) , 2017, ICML.
[38] Yaser Sheikh,et al. Recycle-GAN: Unsupervised Video Retargeting , 2018, ECCV.
[39] Seunghoon Hong,et al. Decomposing Motion and Content for Natural Video Sequence Prediction , 2017, ICLR.
[40] Jan Kautz,et al. Multimodal Unsupervised Image-to-Image Translation , 2018, ECCV.
[41] Jan Kautz,et al. Unsupervised Image-to-Image Translation Networks , 2017, NIPS.
[42] Yuxuan Wang,et al. Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis , 2018, ICML.
[43] Wei-Shi Zheng,et al. MIXGAN: Learning Concepts from Different Domains for Mixture Generation , 2018, IJCAI.
[44] Carlos Busso,et al. Novel Realizations of Speech-Driven Head Movements with Generative Adversarial Networks , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[45] Trung Le,et al. MGAN: Training Generative Adversarial Nets with Multiple Generators , 2018, ICLR.
[46] Jeremy N. Bailenson,et al. The Effect of Behavioral Realism and Form Realism of Real-Time Avatar Faces on Verbal Disclosure, Nonverbal Disclosure, Emotion Recognition, and Copresence in Dyadic Interaction , 2006, PRESENCE: Teleoperators and Virtual Environments.
[47] Stacy Marsella,et al. Predicting Co-verbal Gestures: A Deep and Temporal Modeling Approach , 2015, IVA.
[48] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[49] Eduardo de Campos Valadares,et al. Dancing to the music , 2000 .
[50] Stefan Kopp,et al. Gesture and speech in interaction: An overview , 2014, Speech Commun..
[51] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[52] Daniel McDuff,et al. Neural TTS Stylization with Adversarial and Collaborative Games , 2018, ICLR.
[53] Naoshi Kaneko,et al. Analyzing Input and Output Representations for Speech-Driven Gesture Generation , 2019, IVA.
[54] A. Braun,et al. Symbolic gestures and spoken language are processed by a common neural system , 2009, Proceedings of the National Academy of Sciences.
[55] Geoffrey E. Hinton,et al. Visualizing Data using t-SNE , 2008 .
[56] Yaser Sheikh,et al. OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[57] Catherine Pelachaud,et al. Studies on gesture expressivity for a virtual agent , 2009, Speech Commun..
[58] S. Levine,et al. Gesture controllers , 2010, ACM Trans. Graph..
[59] Sai Krishna Rallabandi,et al. Disentangling Speech and Non-Speech Components for Building Robust Acoustic Models from Found Data , 2019, ArXiv.
[60] M. Studdert-Kennedy. Hand and Mind: What Gestures Reveal About Thought. , 1994 .
[61] Yingying Wang,et al. Efficient Neural Networks for Real-time Motion Style Transfer , 2019, PACMCGIT.