Benchmarking Unsupervised Representation Learning for Continuous Control

We address the problem of learning reusable state representations from a non-stationary stream of high-dimensional observations. This setting arises naturally in Reinforcement Learning (RL), where the data distribution shifts as the policy changes during training. Unsupervised approaches can be trained on such data streams to produce low-dimensional latent embeddings, which can then be reused in domains with different dynamics and rewards. However, the quality of the resulting representations must be evaluated rigorously. We propose an evaluation suite that measures the alignment between the learned latent states and the true low-dimensional states. Using this suite, we benchmark several widely used unsupervised learning approaches, uncovering the strengths and limitations of methods that impose additional constraints or assumptions on the latent space.
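As a concrete illustration of what such an alignment measure could look like, the sketch below fits a linear probe from learned latents to the ground-truth low-dimensional simulator states and reports the held-out R². This is a common diagnostic in the state-representation-learning literature, not necessarily the suite's actual implementation; the function name `linear_alignment_score`, the 80/20 split, and the use of scikit-learn are illustrative assumptions.

```python
# Minimal sketch of a latent-to-state alignment metric (assumed, not the
# paper's implementation): fit a linear map from learned latents to the
# ground-truth states and report held-out R^2 averaged over state dimensions.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score


def linear_alignment_score(latents: np.ndarray, true_states: np.ndarray) -> float:
    """Score how well true states are linearly decodable from latents.

    latents:     (N, d_latent) embeddings from the unsupervised model.
    true_states: (N, d_state)  ground-truth low-dimensional states.
    Returns the mean R^2 across state dimensions on a held-out split.
    """
    # Hold out part of the data so the score reflects generalization
    # of the probe rather than memorization.
    n_train = int(0.8 * len(latents))
    probe = LinearRegression().fit(latents[:n_train], true_states[:n_train])
    preds = probe.predict(latents[n_train:])
    return r2_score(true_states[n_train:], preds)
```

A score near 1 would indicate that the true state is (up to a linear map) recoverable from the latent embedding; nonlinear probes or per-dimension scores are natural variations of the same idea.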
