Paper information - Bayesian Optimization for Developmental Robotics with Meta-Learning by Parameters Bounds Reduction

In robotics, methods and software usually require hyperparameter optimization to be efficient for specific tasks, for instance industrial bin-picking from homogeneous heaps of different objects. We present a developmental framework based on long-term memory and reasoning modules (Bayesian Optimization, visual similarity, and parameter bounds reduction) that allows a robot to use a meta-learning mechanism to increase the efficiency of such continuous and constrained parameter optimizations. A new optimization, viewed as learning for the robot, can take advantage of past experiences (stored in the episodic and procedural memories) to shrink the search space by using reduced parameter bounds computed from the best optimizations the robot has performed on tasks similar to the new one (e.g. bin-picking from a homogeneous heap of a similar object, based on the visual similarity of objects stored in the semantic memory). As an example, we applied the system to the constrained optimization of 9 continuous hyperparameters for a professional software package (Kamido) in industrial robotic arm bin-picking tasks, a step that is needed each time a new object must be handled correctly. We used a simulator to create bin-picking tasks for 8 different objects (7 in simulation and one with a real setup, without and with meta-learning using experiences from other similar objects), achieving good results despite a very small optimization budget, with better performance when meta-learning is used (84.3 % vs 78.9 % overall success, with a small budget of 30 iterations per optimization) for every object tested (p-value = 0.036).
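The bounds-reduction idea described above can be illustrated with a minimal sketch. The abstract does not state the exact rule used to compute the reduced bounds, so everything here is an assumption made for illustration: the function name `reduce_bounds`, the `margin` parameter, and the choice of taking the per-parameter span of the best past points, padded slightly and clipped to the original bounds. The two-parameter data is made up and is not from the paper.

```python
from typing import List, Sequence, Tuple

Bounds = List[Tuple[float, float]]


def reduce_bounds(best_points: Sequence[Sequence[float]],
                  original_bounds: Bounds,
                  margin: float = 0.1) -> Bounds:
    """Shrink each parameter's range to the span of past best points.

    Hypothetical rule (not taken from the paper): for every dimension,
    take the min and max over the best parameter sets found on similar
    tasks, pad by `margin` times the original range, and clip the result
    to the original bounds.
    """
    if not best_points:
        # No similar past experience: keep the full search space.
        return list(original_bounds)

    reduced = []
    for dim, (low, high) in enumerate(original_bounds):
        values = [point[dim] for point in best_points]
        pad = margin * (high - low)
        new_low = max(low, min(values) - pad)
        new_high = min(high, max(values) + pad)
        reduced.append((new_low, new_high))
    return reduced


# Illustrative data: two parameters, three best points from similar objects.
original = [(0.0, 1.0), (0.0, 100.0)]
past_best = [(0.40, 20.0), (0.50, 35.0), (0.45, 30.0)]
reduced = reduce_bounds(past_best, original, margin=0.1)
```

The reduced bounds would then be passed to the optimizer in place of the original ones, so that a small budget is spent in the region where similar tasks were solved well. Falling back to the original bounds when no similar experience exists keeps the sketch usable for a first, memory-free optimization.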
