Voxel-informed Language Grounding
[1] L. Guibas, et al. PartGlot: Learning Shape Part Segmentation from Language Reference Games, 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[2] Dong Xu, et al. 3DVG-Transformer: Relation Modeling for Visual Grounding on Point Clouds, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[3] Mohit Shridhar, et al. Language Grounding with 3D Objects, 2021, CoRL.
[4] Dieter Fox, et al. A Persistent Spatial Semantic Representation for High-level Natural Language Instruction Execution, 2021, CoRL.
[5] Ali Farhadi, et al. LanguageRefer: Spatial-Language Model for 3D Visual Grounding, 2021, CoRL.
[6] Federico Tombari, et al. LegoFormer: Transformers for Block-by-Block Multi-view 3D Reconstruction, 2021, ArXiv.
[7] Ilya Sutskever, et al. Learning Transferable Visual Models From Natural Language Supervision, 2021, ICML.
[8] S. Gershman, et al. Language-Mediated, Object-Centric Representation Learning, 2020, FINDINGS.
[9] Dieter Fox, et al. ACRONYM: A Large-Scale Grasp Dataset Based on Simulation, 2020, 2021 IEEE International Conference on Robotics and Automation (ICRA).
[10] Jacob Andreas, et al. Task-Oriented Dialogue as Dataflow Synthesis, 2020, Transactions of the Association for Computational Linguistics.
[11] Ahmed Abdelreheem, et al. ReferIt3D: Neural Listeners for Fine-Grained 3D Object Identification in Real-World Scenes, 2020, ECCV.
[12] Ruslan Salakhutdinov, et al. Object Goal Navigation using Goal-Oriented Semantic Exploration, 2020, NeurIPS.
[13] Emily M. Bender, et al. Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data, 2020, ACL.
[14] Jacob Andreas, et al. Experience Grounds Language, 2020, EMNLP.
[15] Angel X. Chang, et al. ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language, 2019, ECCV.
[16] Marcus Rohrbach, et al. 12-in-1: Multi-Task Vision and Language Representation Learning, 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Adam W. Harley, et al. Embodied Language Grounding With 3D Visual Feature Representations, 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[18] Stefan Lee, et al. ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks, 2019, NeurIPS.
[19] Christopher D. Manning, et al. Learning by Abstraction: The Neural State Machine, 2019, NeurIPS.
[20] Louis-Philippe Morency, et al. Language2Pose: Natural Language Grounded Pose Forecasting, 2019, 2019 International Conference on 3D Vision (3DV).
[21] Leonidas J. Guibas, et al. Shapeglot: Learning Language for Shape Differentiation, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[22] Silvio Savarese, et al. Text2Shape: Generating Shapes from Natural Language by Learning Joint Embeddings, 2018, ACCV.
[23] Frank Hutter, et al. Decoupled Weight Decay Regularization, 2017, ICLR.
[24] Lei Zhang, et al. Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[25] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[26] Li Fei-Fei, et al. CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Leonidas J. Guibas, et al. ShapeNet: An Information-Rich 3D Model Repository, 2015, ArXiv.
[28] Pat Hanrahan, et al. Semantically-enriched 3D models for common-sense knowledge, 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[29] Angel X. Chang, et al. Learning Spatial Knowledge for Text to 3D Scene Generation, 2014, EMNLP.
[30] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[31] Jianxiong Xiao, et al. 3D ShapeNets: A deep representation for volumetric shapes, 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[32] Yejin Choi, et al. Baby talk: Understanding and generating simple image descriptions, 2011, CVPR 2011.
[33] Fei-Fei Li, et al. ImageNet: A large-scale hierarchical image database, 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[34] Joelle Pineau, et al. Towards robotic assistants in nursing homes: Challenges and results, 2003, Robotics Auton. Syst.
[35] Angela S. Lin, et al. Generating Animated Videos of Human Activities from Natural Language Descriptions, 2018.