CHALET: Cornell House Agent Learning Environment

We present CHALET, a 3D house simulator with support for navigation and manipulation. CHALET includes 58 rooms and 10 house configurations, and allows users to easily create new house and room layouts. CHALET supports a range of common household activities, including moving objects, toggling appliances, and placing objects inside closeable containers. The environment and the actions available within it are designed to create a challenging domain for training and evaluating autonomous agents, including for tasks that combine language, vision, and planning in a dynamic environment.
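To make the agent-environment setup concrete, the sketch below shows the kind of discrete observe-act loop such a simulator exposes, covering both navigation and manipulation actions. This is a minimal, self-contained illustration under assumed names; the `DummyHouseEnv` class and the specific action strings are hypothetical and are not CHALET's actual API.

```python
# Hypothetical sketch of a discrete agent-environment loop for a household
# simulator. All class and action names here are illustrative assumptions.
from dataclasses import dataclass
from typing import List

NAV_ACTIONS = ["move_forward", "turn_left", "turn_right"]
INTERACT_ACTIONS = ["pick_up", "put_down", "toggle", "open", "close"]

@dataclass
class Observation:
    frame: List[List[int]]      # placeholder for a first-person RGB frame
    holding_object: bool = False

class DummyHouseEnv:
    """Stand-in environment: accepts actions and returns observations."""

    def reset(self) -> Observation:
        self.steps, self.holding = 0, False
        return Observation(frame=[[0]], holding_object=self.holding)

    def step(self, action: str) -> Observation:
        assert action in NAV_ACTIONS + INTERACT_ACTIONS, f"unknown action: {action}"
        self.steps += 1
        if action == "pick_up":
            self.holding = True
        elif action == "put_down":
            self.holding = False
        return Observation(frame=[[self.steps]], holding_object=self.holding)

if __name__ == "__main__":
    env = DummyHouseEnv()
    obs = env.reset()
    # Navigate to an object, pick it up, open a container, place it, close it.
    for action in ["move_forward", "turn_left", "pick_up", "open", "put_down", "close"]:
        obs = env.step(action)
        print(action, "-> holding:", obs.holding_object)
```

A real agent would replace the fixed action list with a learned policy conditioned on the observation (and, for language-guided tasks, on an instruction).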
