[Figure: framework overview. Long-Term Memory (Episodic Memory, Semantic Memory, Procedural Memory); Reasoning Modules (Bayesian Optimisation, Visual Similarity); robot learning phase (without Transfer Learning).]

We present a developmental framework based on a long-term memory and reasoning mechanisms (Visual Similarity and Bayesian Optimisation). This architecture allows a robot to autonomously optimize the hyper-parameters of any action and/or vision module, treated as a black box. Learning can take advantage of past experiences (stored in the episodic and procedural memories) to warm-start the exploration with a set of hyper-parameters previously optimized for objects similar to the new, unknown one (stored in a semantic memory). As an example, the system has been used to optimize 9 continuous hyper-parameters of a professional software package (Kamido), both in simulation and with a real robot (a Fanuc industrial robotic arm), with a total of 13 different objects. The robot is able to find a good object-specific optimization in 68 (simulation) or 40 (real) trials. In simulation, we demonstrate the benefit of transfer learning based on visual similarity, as opposed to amnesic learning (i.e. learning from scratch every time). Moreover, with the real robot, we show that the method consistently outperforms manual optimization by an expert, requiring less than 2 hours of training time to achieve more than 88% success.
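The warm-start idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the object descriptors, the cosine-similarity measure, the 0.9 threshold, the normalisation of the 9 hyper-parameters to [0, 1], and every function and field name are assumptions made for illustration. Uniform random sampling is used here as a plain stand-in for whatever initial design the optimizer would otherwise use, and the Bayesian optimisation loop itself is not shown.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity between two descriptor vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def warm_start_design(new_descriptor, memory, n_init, n_params,
                      threshold=0.9, rng=None):
    """Build the initial set of hyper-parameter vectors for a new object.

    memory: list of dicts with hypothetical keys 'descriptor' (the visual
    descriptor of a known object) and 'best_params' (the hyper-parameters
    previously optimized for it).
    Objects at least `threshold` similar to the new one contribute their
    best parameters; the remaining slots are filled with random points,
    which is also what an "amnesic" learner with empty memory gets.
    """
    rng = rng or random.Random(0)
    scored = sorted(
        ((cosine_similarity(new_descriptor, m["descriptor"]), m["best_params"])
         for m in memory),
        key=lambda pair: pair[0],
        reverse=True,
    )
    design = [list(params) for score, params in scored if score >= threshold]
    design = design[:n_init]
    while len(design) < n_init:
        # Stand-in for the optimizer's usual initial design.
        design.append([rng.random() for _ in range(n_params)])
    return design


# Hypothetical memory content: two known objects with 9 normalised parameters.
memory = [
    {"descriptor": [1.0, 0.0, 0.0], "best_params": [0.2] * 9},
    {"descriptor": [0.0, 1.0, 0.0], "best_params": [0.8] * 9},
]

# A new object that looks like the first known one.
design = warm_start_design([0.9, 0.1, 0.0], memory, n_init=3, n_params=9)

# The same request with an empty memory (learning from scratch).
amnesic_design = warm_start_design([0.9, 0.1, 0.0], [], n_init=3, n_params=9)
```

In this sketch the first point of `design` is the parameter set stored for the similar object, while `amnesic_design` contains only random points; the optimizer would then evaluate these points on the robot before proposing new ones.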
