ProcTHOR: Large-Scale Embodied AI Using Procedural Generation
Ali Farhadi | Aniruddha Kembhavi | Kiana Ehsani | Luca Weihs | Winson Han | Alvaro Herrasti | Eric Kolve | Eli VanderBilt | Jordi Salvador | Roozbeh Mottaghi | Matt Deitke
[1] J. Togelius,et al. PCGRL: Procedural Content Generation via Reinforcement Learning , 2020, AIIDE.
[2] J. Tenenbaum,et al. Learning Neuro-Symbolic Relational Transition Models for Bilevel Planning , 2021, 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[3] Jaime Fernández del Río,et al. Array programming with NumPy , 2020, Nature.
[4] Aric Hagberg,et al. Exploring Network Structure, Dynamics, and Function using NetworkX , 2008, Proceedings of the Python in Science Conference.
[5] Chandan Yeshwanth,et al. SceneFormer: Indoor Scene Generation with Transformers , 2020, 2021 International Conference on 3D Vision (3DV).
[6] Maneesh Agrawala,et al. SceneSuggest: Context-driven 3D Scene Design , 2017, ArXiv.
[7] Luisa Caldas,et al. SceneGen: Generative Contextual Scene Augmentation using Scene Graph Priors , 2020, ArXiv.
[8] Yoshua Bengio,et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.
[9] Stefan Lee,et al. EvalAI: Towards Better Evaluation Systems for AI Agents , 2019, ArXiv.
[10] Timnit Gebru,et al. Datasheets for datasets , 2018, Commun. ACM.
[11] Silvio Savarese,et al. ReLMoGen: Leveraging Motion Generation in Reinforcement Learning for Mobile Manipulation , 2020, ArXiv.
[12] Dhruv Batra,et al. Habitat-Web: Learning Embodied Object-Search Strategies from Human Demonstrations at Scale , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Daniel Cohen-Or,et al. GRAINS , 2018, ACM Trans. Graph.
[14] Kristen Grauman,et al. Semantic Audio-Visual Navigation , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Alexander Toshev,et al. ObjectNav Revisited: On Evaluation of Embodied Agents Navigating to Objects , 2020, ArXiv.
[16] Yasutaka Furukawa,et al. House-GAN++: Generative Adversarial Layout Refinement Network towards Intelligent Computational Agent for Professional Architects , 2021, Computer Vision and Pattern Recognition.
[17] Wes McKinney,et al. pandas: a Foundational Python Library for Data Analysis and Statistics , 2011 .
[18] Jitendra Malik,et al. RMA: Rapid Motor Adaptation for Legged Robots , 2021, Robotics: Science and Systems.
[19] K. Grauman,et al. SoundSpaces: Audio-Visual Navigation in 3D Environments , 2019, ECCV.
[20] Silvio Savarese,et al. Robot Navigation in Constrained Pedestrian Environments using Reinforcement Learning , 2020, 2021 IEEE International Conference on Robotics and Automation (ICRA).
[21] Towards Disturbance-Free Visual Mobile Manipulation , 2021, ArXiv.
[22] Sanja Fidler,et al. Learning to Simulate Dynamic Environments With GameGAN , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[23] Leon L. Xu,et al. ABO: Dataset and Benchmarks for Real-World 3D Object Understanding , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[24] Pulkit Agrawal,et al. Stubborn: A Strong Baseline for Indoor Object Navigation , 2022, 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[25] Alec Radford,et al. Zero-Shot Text-to-Image Generation , 2021, ICML.
[26] Dragomir Anguelov,et al. Scalability in Perception for Autonomous Driving: Waymo Open Dataset , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[27] David A. Shamma,et al. YFCC100M , 2015, Commun. ACM.
[28] Yashraj S. Narang,et al. Factory: Fast Contact for Robotic Assembly , 2022, Robotics: Science and Systems.
[29] Klaus C. J. Dietmayer,et al. Deep Multi-Modal Object Detection and Semantic Segmentation for Autonomous Driving: Datasets, Methods, and Challenges , 2019, IEEE Transactions on Intelligent Transportation Systems.
[30] Rafael Bidarra,et al. A Constrained Growth Method for Procedural Floor Plan Generation , 2010 .
[31] Radu Soricut,et al. Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[32] Jungseock Joo,et al. Communicative Learning with Natural Gestures for Embodied Navigation Agents with Human-in-the-Scene , 2021, 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[33] Natalia Gimelshein,et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.
[34] Leonidas J. Guibas,et al. PartNet: A Large-Scale Benchmark for Fine-Grained and Hierarchical Part-Level 3D Object Understanding , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[35] John D. Hunter,et al. Matplotlib: A 2D Graphics Environment , 2007, Computing in Science & Engineering.
[36] Pratul P. Srinivasan,et al. NeRF , 2020, ECCV.
[37] Ali Farhadi,et al. SeGAN: Segmenting and Generating the Invisible , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[38] Yuan Yu,et al. TensorFlow: A system for large-scale machine learning , 2016, OSDI.
[39] Angel X. Chang,et al. Habitat 2.0: Training Home Assistants to Rearrange their Habitat , 2021, NeurIPS.
[40] Yuandong Tian,et al. Building Generalizable Agents with a Realistic and Rich 3D Environment , 2018, ICLR.
[41] J. Tenenbaum,et al. Look, Listen, and Act: Towards Audio-Visual Embodied Navigation , 2019, 2020 IEEE International Conference on Robotics and Automation (ICRA).
[42] Ali Farhadi,et al. Learning to Learn How to Learn: Self-Adaptive Visual Navigation Using Meta-Learning , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[43] Ludwig Schmidt,et al. CLIP on Wheels: Zero-Shot Object Navigation as Object Localization and Exploration , 2022, ArXiv.
[44] Josh H. McDermott,et al. ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation , 2020, NeurIPS Datasets and Benchmarks.
[45] Henry O. Velesaca,et al. Camera pose estimation in multi-view environments: From virtual scenarios to the real world , 2021, Image Vis. Comput.
[46] Leonidas J. Guibas,et al. ObjectNet3D: A Large Scale Database for 3D Object Recognition , 2016, ECCV.
[47] G. Konidaris,et al. Towards Optimal Correlational Object Search , 2021, 2022 International Conference on Robotics and Automation (ICRA).
[48] Leland McInnes,et al. UMAP: Uniform Manifold Approximation and Projection , 2018, J. Open Source Softw.
[49] Roberto Martín-Martín,et al. robosuite: A Modular Simulation Framework and Benchmark for Robot Learning , 2020, ArXiv.
[50] Roozbeh Mottaghi,et al. Visual Room Rearrangement , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[51] Jitendra Malik,et al. On Evaluation of Embodied Navigation Agents , 2018, ArXiv.
[52] Chongyang Ma,et al. Deep Generative Modeling for Scene Synthesis via Hybrid Representations , 2018, ACM Trans. Graph.
[53] Joshua B. Tenenbaum,et al. The ThreeDWorld Transport Challenge: A Visually Guided Task-and-Motion Planning Benchmark Towards Physically Realistic Embodied AI , 2021, 2022 International Conference on Robotics and Automation (ICRA).
[54] Kiana Ehsani,et al. Continuous Scene Representations for Embodied AI , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[55] Roozbeh Mottaghi,et al. ManipulaTHOR: A Framework for Visual Object Manipulation , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[56] Ari S. Morcos,et al. DD-PPO: Learning Near-Perfect PointGoal Navigators from 2.5 Billion Frames , 2019, ICLR.
[57] Mark Chen,et al. Language Models are Few-Shot Learners , 2020, NeurIPS.
[58] Dilek Z. Hakkani-Tür,et al. TEACh: Task-driven Embodied Agents that Chat , 2021, AAAI.
[59] Ilya Sutskever,et al. Learning Transferable Visual Models From Natural Language Supervision , 2021, ICML.
[60] Sanja Fidler,et al. VirtualHome: Simulating Household Activities Via Programs , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[61] David J. Fleet,et al. Kubric: A scalable dataset generator , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[62] Kalyan Sunkavalli,et al. OpenRooms: An Open Framework for Photorealistic Indoor Scene Datasets , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[63] Patrick Labatut,et al. Common Objects in 3D: Large-Scale Learning and Evaluation of Real-life 3D Category Reconstruction , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[64] Geoffrey J. Gordon,et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning , 2010, AISTATS.
[65] Hao Zhang,et al. Graph2Plan , 2020, ACM Trans. Graph.
[66] Ali Farhadi,et al. Target-driven visual navigation in indoor scenes using deep reinforcement learning , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[67] Ali Farhadi,et al. Two Body Problem: Collaborative Visual Task Completion , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[68] Cynthia Matuszek,et al. A Simulator for Human-Robot Interaction in Virtual Reality , 2021, 2021 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW).
[69] Lyne P. Tchapmi,et al. iGibson 1.0: A Simulation Environment for Interactive Tasks in Large Realistic Scenes , 2020, 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[70] R. Mottaghi,et al. Simple but Effective: CLIP Embeddings for Embodied AI , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[71] Ali Farhadi,et al. AI2-THOR: An Interactive 3D Environment for Visual AI , 2017, ArXiv.
[72] Sonia Chernova,et al. Sim2Real Predictivity: Does Evaluation in Simulation Predict Real-World Performance? , 2019, IEEE Robotics and Automation Letters.
[73] P. Abbeel,et al. Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents , 2022, ICML.
[74] Natasha Jaques,et al. Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design , 2020, NeurIPS.
[75] Ali Farhadi,et al. RoboTHOR: An Open Simulation-to-Real Embodied AI Platform , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[76] Evangelos Kalogerakis,et al. SceneGraphNet: Neural Message Passing for 3D Indoor Scene Augmentation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[77] Leonidas J. Guibas,et al. ShapeNet: An Information-Rich 3D Model Repository , 2015, ArXiv.
[78] Tongzhou Mu,et al. ManiSkill: Generalizable Manipulation Skill Benchmark with Large-Scale Demonstrations , 2021, NeurIPS Datasets and Benchmarks.
[79] Qiang Xu,et al. nuScenes: A Multimodal Dataset for Autonomous Driving , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[80] Michael L. Waskom,et al. Seaborn: Statistical Data Visualization , 2021, J. Open Source Softw.
[81] Vincent Vanhoucke,et al. Google Scanned Objects: A High-Quality Dataset of 3D Scanned Household Items , 2022, 2022 International Conference on Robotics and Automation (ICRA).
[82] Andrew J. Davison,et al. RLBench: The Robot Learning Benchmark & Learning Environment , 2019, IEEE Robotics and Automation Letters.
[83] Kai Xu,et al. Learning Generative Models of 3D Structures , 2020, Eurographics.
[84] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[85] Yoshua Bengio,et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling , 2014, ArXiv.
[86] Silvio Savarese,et al. iGibson 2.0: Object-Centric Simulation for Robot Learning of Everyday Household Tasks , 2021, CoRL.
[87] Ali Farhadi,et al. IQA: Visual Question Answering in Interactive Environments , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[88] Ali Farhadi,et al. A Cordial Sync: Going Beyond Marginal Policies for Multi-Agent Embodied Tasks , 2020, ECCV.
[89] Roozbeh Mottaghi,et al. AllenAct: A Framework for Embodied AI Research , 2020, ArXiv.
[90] Jitendra Malik,et al. Habitat: A Platform for Embodied AI Research , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[91] Christopher Potts,et al. Text to 3D Scene Generation with Rich Lexical Grounding , 2015, ACL.
[92] Jason Baldridge,et al. Pathdreamer: A World Model for Indoor Navigation , 2021, ALVR.
[93] Ali Farhadi,et al. Object Manipulation via Visual Target Localization , 2022, ECCV.
[94] Cynthia Matuszek,et al. Head Pose as a Proxy for Gaze in Virtual Reality , 2022 .
[95] Silvio Savarese,et al. BEHAVIOR: Benchmark for Everyday Household Activities in Virtual, Interactive, and Ecological Environments , 2021, CoRL.
[96] Julian Togelius,et al. Learning Controllable Content Generators , 2021, 2021 IEEE Conference on Games (CoG).
[97] Jacob Krantz,et al. Beyond the Nav-Graph: Vision-and-Language Navigation in Continuous Environments , 2020, ECCV.
[98] StandardSim: A Synthetic Dataset For Retail Environments , 2022, ArXiv.
[99] Oriol Vinyals,et al. Flamingo: a Visual Language Model for Few-Shot Learning , 2022, ArXiv.
[100] Alexander M. Rush,et al. Datasets: A Community Library for Natural Language Processing , 2021, EMNLP.
[101] Angel X. Chang,et al. Interactive Learning of Spatial Knowledge for Text to 3D Scene Generation , 2014 .
[102] Dorsa Sadigh,et al. Learning Adaptive Language Interfaces through Decomposition , 2020, INTEXSEMPAR.
[103] Fei-Fei Li,et al. ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[104] Sergey Levine,et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation , 2015, ICLR.
[105] Pratul P. Srinivasan,et al. Block-NeRF: Scalable Large Scene Neural View Synthesis , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[106] Angel X. Chang,et al. Learning Spatial Knowledge for Text to 3D Scene Generation , 2014, EMNLP.
[107] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[108] Vladlen Koltun,et al. Megaverse: Simulating Embodied Agents at One Million Experiences per Second , 2021, ICML.
[109] Kai Wang,et al. Fast and Flexible Indoor Scene Synthesis via Deep Convolutional Generative Models , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[110] Leonidas J. Guibas,et al. SAPIEN: A SimulAted Part-Based Interactive ENvironment , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[111] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[112] Roozbeh Mottaghi,et al. Interactron: Embodied Adaptive Object Detection , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[113] Max Jaderberg,et al. Open-Ended Learning Leads to Generally Capable Agents , 2021, ArXiv.
[114] Jitendra Malik,et al. Gibson Env: Real-World Perception for Embodied Agents , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[115] Joao Marques-Silva,et al. PySAT: A Python Toolkit for Prototyping with SAT Oracles , 2018, SAT.
[116] Abhinav Gupta,et al. Supersizing self-supervision: Learning to grasp from 50K tries and 700 robot hours , 2015, 2016 IEEE International Conference on Robotics and Automation (ICRA).
[117] Rui Tang,et al. Data-driven interior plan generation for residential buildings , 2019, ACM Trans. Graph.
[118] Joel Nothman,et al. SciPy 1.0-Fundamental Algorithms for Scientific Computing in Python , 2019, ArXiv.
[119] Angel X. Chang,et al. Habitat-Matterport 3D Dataset (HM3D): 1000 Large-scale 3D Environments for Embodied AI , 2021, NeurIPS Datasets and Benchmarks.
[120] Luke Zettlemoyer,et al. ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[121] Fernando Marson,et al. Automatic Real-Time Generation of Floor Plans Based on Squarified Treemaps Algorithm , 2010, Int. J. Comput. Games Technol.
[122] Vincent Sitzmann,et al. 3D Neural Scene Representations for Visuomotor Control , 2021, CoRL.
[123] Simple and Effective Synthesis of Indoor 3D Scenes , 2022, arXiv:2204.02960.