TEMOS: Generating diverse human motions from textual descriptions
[1] Jianfeng Gao,et al. Unified Contrastive Learning in Image-Text-Label Space , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[2] T. Komura,et al. FaceFormer: Speech-Driven 3D Facial Animation with Transformers , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[3] Dimitrios Tzionas,et al. Embodied Hands: Modeling and Capturing Hands and Bodies Together , 2022, ArXiv.
[4] Lu Yuan,et al. Florence: A New Foundation Model for Computer Vision , 2021, ArXiv.
[5] Nicholas Rewkowski,et al. Speech2AffectiveGestures: Synthesizing Co-Speech Gestures with Generative Adversarial Affective Expression Learning , 2021, ACM Multimedia.
[6] Ben Saunders,et al. Mixed SIGNals: Sign Language Production via a Mixture of Motion Primitives , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[7] Yaser Sheikh,et al. MeshTalk: 3D Face Animation from Speech using Cross-Modality Disentanglement , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[8] Michael J. Black,et al. Action-Conditioned 3D Human Motion Synthesis with Transformer VAE , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[9] Andrew Zisserman,et al. Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[10] Philipp Slusallek,et al. Synthesis of Compositional Animations from Textual Descriptions , 2021, ArXiv.
[11] Zhengxia Zou,et al. Single-Shot Motion Completion with Transformer , 2021, ArXiv.
[12] Ilya Sutskever,et al. Learning Transferable Visual Models From Natural Language Supervision , 2021, ICML.
[13] David A. Ross,et al. AI Choreographer: Music Conditioned 3D Dance Generation with AIST++ , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[14] Michael J. Black,et al. We are More than Our Joints: Predicting how 3D Bodies Move , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Sanja Fidler,et al. Learning to Generate Diverse Dance Motions with Transformer , 2020, ArXiv.
[16] Shihao Zou,et al. Action2Motion: Conditioned Generation of 3D Human Motions , 2020, ACM Multimedia.
[17] Michael J. Black,et al. Perpetual Motion: Generating Unbounded Human Motion , 2020, ArXiv.
[18] Qiang Ji,et al. Bayesian Adversarial Human Motion Synthesis , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Cristian Sminchisescu,et al. Weakly Supervised 3D Human Pose and Shape Reconstruction with Normalizing Flows , 2020, ECCV.
[20] Kris M. Kitani,et al. DLow: Diversifying Latent Flows for Diverse Human Motion Prediction , 2020, ECCV.
[21] Lysandre Debut,et al. HuggingFace's Transformers: State-of-the-art Natural Language Processing , 2019, ArXiv.
[22] Jonas Beskow,et al. MoGlow , 2019, ACM Trans. Graph.
[23] Natalia Gimelshein,et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.
[24] Thomas Wolf,et al. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter , 2019, ArXiv.
[25] Dahua Lin,et al. Convolutional Sequence Generation for Skeleton-Based Action Synthesis , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[26] Otmar Hilliges,et al. Structured Prediction Helps 3D Human Motion Modelling , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[27] Omer Levy,et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach , 2019, ArXiv.
[28] Louis-Philippe Morency,et al. Language2Pose: Natural Language Grounded Pose Forecasting , 2019, 2019 International Conference on 3D Vision (3DV).
[29] Jitendra Malik,et al. Learning Individual Styles of Conversational Gesture , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[30] Michael J. Black,et al. Capture, Learning, and Synthesis of 3D Speaking Styles , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[31] Nikolaus F. Troje,et al. AMASS: Archive of Motion Capture As Surface Shapes , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[32] Yi Zhou,et al. On the Continuity of Rotation Representations in Neural Networks , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[33] Frank Hutter,et al. Decoupled Weight Decay Regularization , 2017, ICLR.
[34] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[35] Tetsuya Ogata,et al. Paired Recurrent Autoencoders for Bidirectional Translation Between Robot Actions and Linguistic Descriptions , 2018, IEEE Robotics and Automation Letters.
[36] Dario Pavllo,et al. QuaterNet: A Quaternion-based Recurrent Model for Human Motion , 2018, BMVC.
[37] Xiao Lin,et al. Human Motion Modeling using DVGANs , 2018, ArXiv.
[38] Zicheng Liu,et al. HP-GAN: Probabilistic 3D Human Motion Prediction via GAN , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[39] Timothy Ha,et al. Text2Action: Generative Adversarial Synthesis from Language to Action , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[40] Tamim Asfour,et al. Learning a bidirectional mapping between human whole-body motion and natural language using deep recurrent neural networks , 2017, Robotics Auton. Syst.
[41] Raymond J. Mooney,et al. Generating Animated Videos of Human Activities from Natural Language Descriptions , 2018 .
[42] Jaakko Lehtinen,et al. Audio-driven facial animation by joint end-to-end learning of pose and emotion , 2017, ACM Trans. Graph.
[43] Taku Komura,et al. A Recurrent Variational Autoencoder for Human Motion Synthesis , 2017, BMVC.
[44] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[45] Michael J. Black,et al. On Human Motion Prediction Using Recurrent Neural Networks , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[46] Tamim Asfour,et al. The KIT Motion-Language Dataset , 2016, Big Data.
[47] Taku Komura,et al. A Deep Learning Framework for Character Motion Synthesis and Editing , 2016, ACM Trans. Graph.
[48] Tao Mei,et al. MSR-VTT: A Large Video Description Dataset for Bridging Video and Language , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[49] Karrie Karahalios,et al. DataTone: Managing Ambiguity in Natural Language Interfaces for Data Visualization , 2015, UIST.
[50] Michael J. Black,et al. SMPL: A Skinned Multi-Person Linear Model , 2023 .
[51] Tamim Asfour,et al. The KIT whole-body human motion database , 2015, 2015 International Conference on Advanced Robotics (ICAR).
[52] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[53] Stefan Ulbrich,et al. Master Motor Map (MMM) — Framework and toolkit for capturing, representing, and reproducing human motion on humanoid robots , 2014, 2014 IEEE-RAS International Conference on Humanoid Robots.
[54] Cristian Sminchisescu,et al. Human3.6M: Large Scale Datasets and Predictive Methods for 3D Human Sensing in Natural Environments , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[55] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[56] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.
[57] Cristian Sminchisescu,et al. Latent structured models for human pose estimation , 2011, 2011 International Conference on Computer Vision.
[58] Eduardo de Campos Valadares,et al. Dancing to the music , 2000 .
[59] Michael J. Coombs,et al. Designing for Human-Computer Communication , 1983 .