HUMANISE: Language-conditioned Human Motion Generation in 3D Scenes
Yixin Zhu | Siyuan Huang | Wei Liang | Tengyu Liu | Zan Wang | Yixin Chen
[1] Y. Li,et al. Understanding Embodied Reference with Touch-Line Transformer , 2022, ICLR.
[2] Michael J. Black,et al. Capturing and Inferring Dense Full-Body Human-Scene Contact , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[3] Michael J. Black,et al. Human-Aware Object Placement for Visual Environment Reconstruction , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[4] M. Kawanabe,et al. ScanQA: 3D Question Answering for Spatial Scene Understanding , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[5] Dongdong Chen,et al. 3D Question Answering , 2021, IEEE transactions on visualization and computer graphics.
[6] S. Fidler,et al. Physics-based Human Motion Estimation and Synthesis from Videos , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[7] Deqian Kong,et al. YouRefIt: Embodied Reference Understanding with Language and Gesture , 2021, IEEE International Conference on Computer Vision.
[8] Ruben Villegas,et al. Stochastic Scene-Aware Motion Prediction , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[9] Mohit Shridhar,et al. Language Grounding with 3D Objects , 2021, CoRL.
[10] Ali Farhadi,et al. LanguageRefer: Spatial-Language Model for 3D Visual Grounding , 2021, CoRL.
[11] Nikos Athanasiou,et al. BABEL: Bodies, Action and Behavior with English Labels , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[12] Bo Dai,et al. Scene-aware Generative Network for Human Motion Synthesis , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Michael J. Black,et al. Action-Conditioned 3D Human Motion Synthesis with Transformer VAE , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[14] C. Theobalt,et al. Synthesis of Compositional Animations from Textual Descriptions , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[15] Song-Chun Zhu,et al. VLGrammar: Grounded Grammar Induction of Vision and Language , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[16] Joachim Tesch,et al. Populating 3D Scenes by Learning Human-Scene Interaction , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[17] X. Wang,et al. Synthesizing Long-Term 3D Human Motion and Interaction in 3D Scenes , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[18] Michael J. Black,et al. We are More than Our Joints: Predicting how 3D Bodies Move , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Klaus Dietmayer,et al. Point Transformer , 2020, IEEE Access.
[20] Weifeng Chen,et al. Learning to Sit: Synthesizing Human-Chair Interactions via Hierarchical Control , 2019, AAAI.
[21] Dimitrios Tzionas,et al. GRAB: A Dataset of Whole-Body Human Grasping of Objects , 2020, ECCV.
[22] Ahmed Abdelreheem,et al. ReferIt3D: Neural Listeners for Fine-Grained 3D Object Identification in Real-World Scenes , 2020, ECCV.
[23] Michael J. Black,et al. Generating Person-Scene Interactions in 3D Scenes , 2020, ArXiv.
[24] Yixin Zhu,et al. LEMMA: A Multi-view Dataset for Learning Multi-agent Multi-task Activities , 2020, ECCV.
[25] Shihao Zou,et al. Action2Motion: Conditioned Generation of 3D Human Motions , 2020, ACM Multimedia.
[26] Minh Vo,et al. Long-term Human Motion Prediction with Scene Context , 2020, ECCV.
[27] Kris M. Kitani,et al. DLow: Diversifying Latent Flows for Diverse Human Motion Prediction , 2020, ECCV.
[28] J. Tenenbaum,et al. Look, Listen, and Act: Towards Audio-Visual Embodied Navigation , 2019, 2020 IEEE International Conference on Robotics and Automation (ICRA).
[29] Angel X. Chang,et al. ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language , 2019, ECCV.
[30] Michael J. Black,et al. Generating 3D People in Scenes Without People , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[31] Sebastian Starke,et al. Neural state machine for character-scene interactions , 2019, ACM Trans. Graph.
[32] Song-Chun Zhu,et al. Holistic++ Scene Understanding: Single-View 3D Holistic Scene Parsing and Human Pose Estimation With Human-Object Interaction and Physical Commonsense , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[33] Dimitrios Tzionas,et al. Resolving 3D Human Pose Ambiguities With 3D Scene Constraints , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[34] Louis-Philippe Morency,et al. Language2Pose: Natural Language Grounded Pose Forecasting , 2019, 2019 International Conference on 3D Vision (3DV).
[35] Michael Goesele,et al. The Replica Dataset: A Digital Replica of Indoor Spaces , 2019, ArXiv.
[36] Zhe Wang,et al. Geometric Pose Affordance: 3D Human Pose with Scene Constraints , 2019, ArXiv.
[37] Dhruv Batra,et al. SplitNet: Sim2Sim and Task2Task Transfer for Embodied Visual Navigation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[38] Dimitrios Tzionas,et al. Expressive Body Capture: 3D Hands, Face, and Body From a Single Image , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[39] Nikolaus F. Troje,et al. AMASS: Archive of Motion Capture As Surface Shapes , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[40] Wei Liang,et al. Functional Workspace Optimization via Learning Personal Preferences from Virtual Experiences , 2019, IEEE Transactions on Visualization and Computer Graphics.
[41] Yi Zhou,et al. On the Continuity of Rotation Representations in Neural Networks , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[42] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[43] Zhen Zhang,et al. Convolutional Sequence to Sequence Model for Human Dynamics , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[44] Xiao Lin,et al. Human Motion Modeling using DVGANs , 2018, ArXiv.
[45] Stefan Lee,et al. Embodied Question Answering , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[46] Zicheng Liu,et al. HP-GAN: Probabilistic 3D Human Motion Prediction via GAN , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[47] Qi Wu,et al. Vision-and-Language Navigation: Interpreting Visually-Grounded Navigation Instructions in Real Environments , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[48] Leonidas J. Guibas,et al. PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space , 2017, NIPS.
[49] Matthias Nießner,et al. ScanNet: Richly-Annotated 3D Reconstructions of Indoor Scenes , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[50] Matthias Nießner,et al. PiGraphs , 2016, ACM Trans. Graph.
[51] Honglak Lee,et al. Learning Structured Output Representation using Deep Conditional Generative Models , 2015, NIPS.
[52] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[53] Cristian Sminchisescu,et al. Human3.6M: Large Scale Datasets and Predictive Methods for 3D Human Sensing in Natural Environments , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[54] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[55] Jon Louis Bentley,et al. Multidimensional binary search trees used for associative searching , 1975, CACM.