On Assessing the Usefulness of Proxy Domains for Developing and Evaluating Embodied Agents

It is often impossible or impractical to develop and evaluate agents entirely on the target domain in which they will be deployed. This is particularly true in robotics, where experiments on hardware are far more arduous than experiments in simulation, and arguably even more so for learning-based agents. Considerable recent effort has therefore been devoted to developing increasingly realistic, higher-fidelity simulators. However, we lack a principled way to evaluate how good a "proxy domain" is, specifically in terms of how useful it is in helping us achieve our end objective of building an agent that performs well in the target domain. In this work, we investigate methods to address this need. We begin by clearly separating two uses of proxy domains that are often conflated: 1) their ability to faithfully predict agent performance and 2) their ability to serve as a useful tool for learning. We then establish new proxy usefulness (PU) metrics for comparing different proxy domains: the relative predictive PU, which assesses the predictive ability of a proxy domain, and the learning PU, which quantifies the usefulness of a proxy as a tool for generating learning data. Furthermore, we argue that the value of a proxy is conditioned on the task that it is being used to help solve. Finally, we demonstrate how these metrics can be used to optimize parameters of the proxy domain for which obtaining ground truth via system identification is not trivial.
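
To make the "faithful predictor" use of a proxy concrete, the sketch below scores a proxy by how well it preserves the ranking of a set of agents relative to the target domain, and then picks the proxy parameter setting with the best agreement. This is only an illustration under stated assumptions: the abstract does not define the relative predictive PU, so a plain Kendall rank correlation is swapped in as a stand-in, and the function names `rank_agreement` and `best_proxy_setting` as well as all scores are hypothetical.

```python
from itertools import combinations


def rank_agreement(proxy_scores, target_scores):
    """Kendall tau-a between per-agent scores in a proxy and in the target.

    Stand-in for a predictive usefulness measure (an assumption, not the
    paper's metric): +1 means the proxy ranks all agents as the target
    does, -1 means it reverses every pairwise ordering.
    """
    if len(proxy_scores) != len(target_scores):
        raise ValueError("need one proxy score and one target score per agent")
    n = len(proxy_scores)
    n_pairs = n * (n - 1) // 2
    if n_pairs == 0:
        return 0.0
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        sign = (proxy_scores[i] - proxy_scores[j]) * (target_scores[i] - target_scores[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / n_pairs


def best_proxy_setting(candidates, target_scores):
    """Return the name of the candidate proxy whose ranking best matches the target."""
    return max(candidates, key=lambda name: rank_agreement(candidates[name], target_scores))


# Hypothetical scores for four agents; the numbers are made up for illustration.
target = [0.9, 0.6, 0.3, 0.1]
candidates = {
    "setting_a": [0.8, 0.7, 0.2, 0.4],  # swaps only the two weakest agents
    "setting_b": [0.2, 0.5, 0.6, 0.9],  # reverses the whole ranking
}
chosen = best_proxy_setting(candidates, target)
```

A rank-based score is used here because a proxy can be useful for comparing agents even when its absolute performance numbers differ from those in the target domain; whether that matches the paper's formulation is not stated in the abstract.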
