Representation Matters: Improving Perception and Exploration for Robotics

Projecting high-dimensional environment observations into lower-dimensional structured representations can considerably improve data-efficiency for reinforcement learning in domains with limited data, such as robotics. Can a single generally useful representation be found? To answer this question, it is important to understand how the representation will be used by the agent and what properties a good representation should have. In this paper we systematically evaluate a number of common learned and hand-engineered representations in the context of three robotics tasks: lifting, stacking, and pushing of 3D blocks. The representations are evaluated in two use-cases: as input to the agent, or as a source of auxiliary tasks. Furthermore, the value of each representation is evaluated in terms of three properties: dimensionality, observability, and disentanglement. We show that representation choice can significantly improve performance in both use-cases, and that some representations can perform on par with simulator states as agent inputs. Finally, our results challenge common intuitions by demonstrating that: 1) dimensionality strongly matters for task generation, but is negligible for inputs; 2) observability of task-relevant aspects mostly affects the input representation use-case; and 3) disentanglement leads to better auxiliary tasks, but has only limited benefits for input representations. This work serves as a step towards a more systematic understanding of what makes a good representation for control in robotics, enabling practitioners to make more informed choices for developing new learned or hand-engineered representations.
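To make the two use-cases concrete, the following is a minimal sketch, not the paper's implementation: it assumes a fixed random-projection encoder as a stand-in for a learned representation (e.g. a VAE latent), and the names `encode`, `policy_input`, and `auxiliary_rewards` are hypothetical helpers introduced for illustration only.

```python
import numpy as np

# Hypothetical frozen encoder: projects a high-dimensional observation
# (e.g. a flattened camera frame) to a low-dimensional latent vector via
# a fixed random projection, standing in for a learned encoder.
rng = np.random.default_rng(0)
OBS_DIM, LATENT_DIM = 64 * 64 * 3, 8
W = rng.standard_normal((OBS_DIM, LATENT_DIM)) / np.sqrt(OBS_DIM)

def encode(observation: np.ndarray) -> np.ndarray:
    return observation @ W

# Use-case 1: the representation as agent input.
# The policy consumes the latent vector instead of raw pixels.
def policy_input(observation: np.ndarray) -> np.ndarray:
    return encode(observation)

# Use-case 2: the representation as a source of auxiliary tasks.
# Each latent dimension defines one auxiliary reward of the form
# "drive this feature of the scene towards a target value".
def auxiliary_rewards(observation: np.ndarray, targets: np.ndarray) -> np.ndarray:
    z = encode(observation)
    return -np.abs(z - targets)  # one shaped reward per latent dimension

obs = rng.random(OBS_DIM)                                   # stand-in camera frame
print(policy_input(obs).shape)                              # (8,): latent fed to the agent
print(auxiliary_rewards(obs, np.zeros(LATENT_DIM)).shape)   # (8,): one reward per dimension
```

Under this framing, the paper's three properties map directly onto the sketch: dimensionality is `LATENT_DIM` (and hence the number of auxiliary tasks generated), observability is whether task-relevant state survives the projection, and disentanglement is whether individual latent dimensions correspond to independently controllable aspects of the scene.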
