iGibson, a Simulation Environment for Interactive Tasks in Large Realistic Scenes

We present iGibson, a novel simulation environment for developing robotic solutions to interactive tasks in large-scale realistic scenes. Our environment contains fifteen fully interactive home-sized scenes populated with rigid and articulated objects. The scenes are replicas of 3D-scanned real-world homes, so the distribution of objects and the layouts match those of the real world. iGibson integrates several key features to facilitate the study of interactive tasks: i) generation of high-quality virtual sensor signals (RGB, depth, segmentation, LiDAR, flow, among others), ii) domain randomization to change the materials of the objects (both visual texture and dynamics) and/or their shapes, iii) integrated sampling-based motion planners to generate collision-free trajectories for robot bases and arms, and iv) an intuitive human-iGibson interface that enables efficient collection of human demonstrations. Through experiments, we show that the full interactivity of the scenes enables agents to learn useful visual representations that accelerate the training of downstream manipulation tasks. We also show that iGibson's features enable the generalization of navigation agents, and that the human-iGibson interface and integrated motion planners facilitate efficient imitation learning of simple human-demonstrated behaviors. iGibson is open-sourced with comprehensive examples and documentation. For more information, visit our project website.
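To make the idea of a sampling-based motion planner in feature iii) concrete, the sketch below implements a minimal rapidly-exploring random tree (RRT) for a point robot in the 2D unit square with circular obstacles. It is an illustration only and does not use or reproduce iGibson's planners or API: the function names (`collides`, `segment_free`, `rrt`), the obstacle representation, and all parameter values are assumptions chosen for this example.

```python
import math
import random


def collides(p, obstacles):
    """Return True if point p lies inside any circular obstacle (cx, cy, r)."""
    return any(math.hypot(p[0] - cx, p[1] - cy) <= r for cx, cy, r in obstacles)


def segment_free(a, b, obstacles, resolution=0.01):
    """Approximate segment check: test points sampled along a-b at a fixed resolution."""
    n = max(1, int(math.dist(a, b) / resolution))
    for i in range(n + 1):
        t = i / n
        point = (a[0] + (b[0] - a[0]) * t, a[1] + (b[1] - a[1]) * t)
        if collides(point, obstacles):
            return False
    return True


def rrt(start, goal, obstacles, step=0.05, goal_bias=0.1, max_iters=5000, seed=0):
    """Grow a tree from start; return a list of waypoints to goal, or None on failure."""
    rng = random.Random(seed)  # hypothetical fixed seed, for repeatable runs
    nodes = [start]
    parent = {0: None}
    for _ in range(max_iters):
        # Sample the goal occasionally, otherwise a uniform point in the unit square.
        target = goal if rng.random() < goal_bias else (rng.random(), rng.random())
        i_near = min(range(len(nodes)), key=lambda i: math.dist(nodes[i], target))
        near = nodes[i_near]
        d = math.dist(near, target)
        if d == 0.0:
            continue
        # Move at most one step from the nearest node toward the sample.
        scale = min(step, d) / d
        new = (near[0] + (target[0] - near[0]) * scale,
               near[1] + (target[1] - near[1]) * scale)
        if not segment_free(near, new, obstacles):
            continue
        nodes.append(new)
        parent[len(nodes) - 1] = i_near
        # Stop once the goal is within one collision-free step of the new node.
        if math.dist(new, goal) <= step and segment_free(new, goal, obstacles):
            path = []
            i = len(nodes) - 1
            while i is not None:
                path.append(nodes[i])
                i = parent[i]
            path.reverse()
            if path[-1] != goal:
                path.append(goal)
            return path
    return None


if __name__ == "__main__":
    obstacles = [(0.5, 0.5, 0.2)]  # one disc blocking the straight line
    path = rrt((0.1, 0.1), (0.9, 0.9), obstacles)
    print("waypoints:", None if path is None else len(path))
```

A planner for a robot base or arm would replace the 2D point with a configuration vector and the disc test with a collision query against the scene; the sample, extend, and check loop stays the same.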
