Exploratory Robotic Controllers: An Evolution and Information Theory Driven Approach (Exploration Robotique Autonome hybridant : évolution et théorie de l'information)

This thesis is concerned with building autonomous exploratory robotic controllers in an online, on-board approach, with no requirement for ground truth or human intervention in the experimental setting.

This study is primarily motivated by autonomous robotics, specifically autonomous robot swarms. In this context, one faces two difficulties. Firstly, standard simulator-based approaches are hardly effective, for reasons of both accuracy and computational efficiency: the simulator accuracy is hindered by the variability of the hardware, and the computational cost grows super-linearly with the number of robots in the swarm. Secondly, the standard goal-driven approach to controller design does not apply, since the objective is defined at the swarm level and there is no explicit objective function at the individual level.

A first step toward autonomous exploratory controllers is proposed in the thesis. The Evolution & Information Theory-based Exploratory Robotics (Ev-ITER) approach hybridizes two approaches stemming from Evolutionary Robotics and from Reinforcement Learning, with the goal of getting the best of both worlds: (i) primary controllers, or crawling controllers, are evolved to generate sensori-motor trajectories with high entropy; (ii) the data repository built from the crawling controllers is exploited to provide prior knowledge to secondary controllers, which are inspired by the robust intrinsic motivation setting and achieve a thorough exploration of the environment.

The contributions of the thesis are threefold. Firstly, Ev-ITER fulfills the desired requirements: it runs online and on-board, without requiring any ground truth or support. Secondly, Ev-ITER outperforms both the evolutionary and the information theory-based approaches used standalone, in terms of actual exploration of the arena. Thirdly and most importantly, the Ev-ITER controller shows some generality: it can efficiently explore arenas other than the one considered during the first evolutionary phase. It must be emphasized that the generality of the learned controller with respect to the considered environment has rarely been considered, either in reinforcement learning or in evolutionary robotics.
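
The entropy criterion mentioned in point (i) can be illustrated with a minimal sketch. The abstract does not specify how sensori-motor trajectories are encoded or how their entropy is estimated, so everything below is an assumption made for illustration: readings are taken to lie in [0, 1], they are discretized into a fixed number of bins, and the hypothetical helpers `discretize`, `to_states` and `trajectory_entropy` compute the Shannon entropy of the empirical distribution of visited (sensor, motor) states. It is not the fitness function used in the thesis.

```python
import math
from collections import Counter
from typing import Hashable, Iterable, List, Sequence, Tuple


def trajectory_entropy(states: Iterable[Hashable]) -> float:
    """Shannon entropy, in bits, of the empirical distribution of visited states."""
    counts = Counter(states)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Sum of p * log2(1 / p) over all distinct states.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def discretize(reading: float, n_bins: int = 4) -> int:
    """Map a reading assumed to lie in [0, 1] to an integer bin in [0, n_bins - 1]."""
    clipped = min(max(reading, 0.0), 1.0)
    return min(int(clipped * n_bins), n_bins - 1)


def to_states(trajectory: Sequence[Tuple[float, float]]) -> List[Tuple[int, int]]:
    """Turn (sensor, motor) readings into discrete sensori-motor states."""
    return [(discretize(s), discretize(m)) for s, m in trajectory]


# Two hypothetical trajectories of (sensor, motor) readings.
stuck = [(0.1, 0.0)] * 8  # the robot repeats a single state
varied = [(0.1, 0.0), (0.4, 0.3), (0.6, 0.9), (0.9, 0.6)] * 2  # four states, visited equally

print(trajectory_entropy(to_states(stuck)))   # 0.0
print(trajectory_entropy(to_states(varied)))  # 2.0
```

Under these assumptions, a controller that stays in one sensori-motor state scores zero, while one that visits four states equally often scores two bits, so an evolutionary search that maximizes this quantity would favor controllers producing diverse trajectories.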


[255]  Jürgen Schmidhuber,et al.  Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990–2010) , 2010, IEEE Transactions on Autonomous Mental Development.

[256]  Thomas Bäck,et al.  An Overview of Evolutionary Computation , 1993, ECML.

[257]  Chun Zhang,et al.  Colias: An Autonomous Micro Robot for Swarm Robotic Applications , 2014 .

[258]  Stefano Nolfi,et al.  Evolving Mobile Robots in Simulated and Real Environments , 1995, Artificial Life.

[259]  Peter Auer,et al.  Autonomous Exploration For Navigating In MDPs , 2012, COLT.

[260]  Fang Sheng,et al.  Genetic algorithm and simulated annealing for optimal robot arm PID control , 1994, Proceedings of the First IEEE Conference on Evolutionary Computation. IEEE World Congress on Computational Intelligence.

[261]  Rolf Pfeifer,et al.  Interacting with the real world: design principles for intelligent systems , 2004, Artificial Life and Robotics.

[262]  Andrew McCallum,et al.  Toward Optimal Active Learning through Monte Carlo Estimation of Error Reduction , 2001, ICML 2001.

[263]  Stéphane Doncieux,et al.  New Horizons in Evolutionary Robotics: Extended Contributions from the 2009 EvoDeRob Workshop , 2011 .

[264]  Manuela M. Veloso,et al.  An evolutionary approach to gait learning for four-legged robots , 2004, 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE Cat. No.04CH37566).

[265]  Dock Bumpers,et al.  Volume 2 , 2005, Proceedings of the Ninth International Conference on Computer Supported Cooperative Work in Design, 2005..

[266]  Dave Cliff,et al.  Challenges in evolving controllers for physical robots , 1996, Robotics Auton. Syst..

[267]  A. Auger Convergence results for the ( 1 , )-SA-ES using the theory of-irreducible Markov chains , 2005 .

[268]  Juan R. Castro,et al.  Genetic Algorithm Optimization for Type-2 Non-singleton Fuzzy Logic Controllers , 2014, Recent Advances on Hybrid Approaches for Designing Intelligent Systems.

[269]  Jan Peters,et al.  Noname manuscript No. (will be inserted by the editor) Policy Search for Motor Primitives in Robotics , 2022 .