Active Learning and Intrinsically Motivated Exploration in Robots: Advances and Challenges

Learning techniques are increasingly used in today’s complex robotic systems. Robots are expected to handle a wide variety of tasks with their high-dimensional and complex bodies, to manipulate objects, and to interact with humans in an intuitive and friendly way. In this new setting, not all relevant information is available at design time, and robots should typically be able to learn, through self-experimentation or through human–robot interaction, how to tune their innate perceptual-motor skills or to learn, cumulatively, novel skills that were not preprogrammed. In short, robots need the capacity to develop in an open-ended manner and in an open-ended environment, in a way that is analogous to human development, which combines genetic and epigenetic factors. This challenge is at the center of the developmental robotics field [7], [35]–[37].

Among the various technical challenges raised by these issues, exploration is paramount. Self-experimentation and learning by interacting with the physical and social world are essential for acquiring new knowledge and skills. Exploration in physical robotic agents poses challenging problems due to the high-dimensional and complex dynamics of the body–environment system (especially when other agents are part of the environment), and to the open-ended nature of the sensory-motor spaces of real environments. Typically, in those spaces, the lack of adequate constraints in exploration strategies results at best in very slow learning, and most often in no consistent learning at all; such unconstrained exploration can even be dangerous or destructive.

This special issue provides a discussion forum and presents novel contributions on intentional exploration, i.e., internal mechanisms and constraints that explicitly foster organized exploration. Of central interest are mechanisms that push agents to select actions that allow them to gain maximal knowledge or maximal control/competence over their bodies and environments. To this end, different fields have suggested different approaches, ranging from operational heuristics that indirectly target maximal information gain, to theoretically

[1]  R. W. White  Motivation reconsidered: the concept of competence , 1959, Psychological Review.

[2]  W. J. Studden,et al.  Theory Of Optimal Experiments , 1972 .

[3]  Heinz von Foerster,et al.  On Constructing a Reality , 2015, Environmental Design Research.

[4]  Edward L. Deci,et al.  Intrinsic Motivation and Self-Determination in Human Behavior , 1975, Perspectives in Social Psychology.

[5]  Dana Angluin,et al.  Queries and concept learning , 1988, Machine Learning.

[6]  Richard S. Sutton,et al.  Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.

[7]  Jürgen Schmidhuber,et al.  A possibility for implementing curiosity and boredom in model-building neural controllers , 1991 .

[8]  Sebastian Thrun,et al.  Active Exploration in Dynamic Environments , 1991, NIPS.

[9]  David A. Cohn,et al.  Active Learning with Statistical Models , 1996, NIPS.

[10]  K. Chaloner,et al.  Bayesian Experimental Design: A Review , 1995 .

[11]  Peter Dayan,et al.  A Neural Substrate of Prediction and Reward , 1997, Science.

[12]  Sebastian Thrun,et al.  Exploration in active learning , 1998 .

[13]  Donald R. Jones,et al.  Efficient Global Optimization of Expensive Black-Box Functions , 1998, J. Glob. Optim..

[14]  Greg Schohn,et al.  Less is More: Active Learning with Support Vector Machines , 2000, ICML.

[15]  James L. McClelland,et al.  Autonomous Mental Development by Robots and Animals , 2001, Science.

[16]  Xiao Huang,et al.  Novelty and Reinforcement Learning in the Value System of Developmental Robots , 2002 .

[17]  Peter Dayan,et al.  Dopamine: generalization and bonuses , 2002, Neural Networks.

[18]  Linda B. Smith,et al.  Development as a dynamic system , 1992, Trends in Cognitive Sciences.

[19]  Giulio Sandini,et al.  Developmental robotics: a survey , 2003, Connect. Sci..

[20]  Nuttapong Chentanez,et al.  Intrinsically Motivated Reinforcement Learning , 2004, NIPS.

[21]  Nuttapong Chentanez,et al.  Intrinsically Motivated Learning of Hierarchical Collections of Skills , 2004 .

[22]  Sanjoy Dasgupta,et al.  Analysis of a greedy active learning strategy , 2004, NIPS.

[23]  Philippe Gaussier,et al.  Learning Invariant Sensorimotor Behaviors: A Developmental Approach to Imitation Mechanisms , 2004, Adapt. Behav..

[24]  Pierre-Yves Oudeyer,et al.  The Playground Experiment: Task-Independent Development of a Curious Robot , 2005 .

[25]  Deepak Kumar,et al.  Bringing Up Robot: Fundamental Mechanisms for Creating a Self-Motivated, Self-Organizing Architecture , 2005, Cybern. Syst..

[26]  Jürgen Schmidhuber,et al.  Optimal Artificial Curiosity, Creativity, Music, and the Fine Arts , 2005 .

[27]  P. Redgrave,et al.  The short-latency dopamine signal: a role in discovering novel actions? , 2006, Nature Reviews Neuroscience.

[28]  David S. Leslie,et al.  Generalised weakened fictitious play , 2006, Games Econ. Behav..

[29]  R. Pfeifer,et al.  Self-Organization, Embodiment, and Biologically Inspired Robotics , 2007, Science.

[30]  Trevor Darrell,et al.  Active Learning with Gaussian Processes for Object Categorization , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[31]  Marco Mirolli,et al.  Evolution and Learning in an Intrinsically Motivated Reinforcement Learning Robot , 2007, ECAL.

[32]  Andreas Krause,et al.  Nonmyopic active learning of Gaussian processes: an exploration-exploitation approach , 2007, ICML '07.

[33]  Lyle H. Ungar,et al.  Active Learning for Logistic Regression , 2007 .

[34]  Francisco S. Melo,et al.  Convergence of Independent Adaptive Learners , 2007, EPIA Workshops.

[35]  Pierre-Yves Oudeyer,et al.  Intrinsic Motivation Systems for Autonomous Mental Development , 2007, IEEE Transactions on Evolutionary Computation.

[36]  Matthias W. Seeger,et al.  Compressed sensing and Bayesian experimental design , 2008, ICML '08.

[37]  Robert D. Nowak,et al.  Minimax Bounds for Active Learning , 2007, IEEE Transactions on Information Theory.

[38]  Andreas Krause,et al.  Near-Optimal Sensor Placements in Gaussian Processes: Theory, Efficient Algorithms and Empirical Studies , 2008 .

[39]  Stephen Hart,et al.  Intrinsically motivated hierarchical manipulation , 2008, 2008 IEEE International Conference on Robotics and Automation.

[40]  Pierre-Yves Oudeyer,et al.  How can we define intrinsic motivation , 2008 .

[41]  Katharina J. Rohlfing,et al.  Attention via Synchrony: Making Use of Multimodal Cues in Social Learning , 2009, IEEE Transactions on Autonomous Mental Development.

[42]  Manuel Lopes,et al.  Active Learning for Reward Estimation in Inverse Reinforcement Learning , 2009, ECML/PKDD.

[43]  Nando de Freitas,et al.  A Bayesian exploration-exploitation approach for optimal online sensing and planning with a visually guided mobile robot , 2009, Auton. Robots.

[44]  Pierre-Yves Oudeyer,et al.  R-IAC: Robust Intrinsically Motivated Exploration and Active Learning , 2009, IEEE Transactions on Autonomous Mental Development.

[45]  Manuela M. Veloso,et al.  Interactive Policy Learning through Confidence-Based Autonomy , 2014, J. Artif. Intell. Res..

[46]  Masaki Ogino,et al.  Cognitive Developmental Robotics: A Survey , 2009, IEEE Transactions on Autonomous Mental Development.

[47]  Pierre-Yves Oudeyer,et al.  Robust intrinsically motivated exploration and active learning , 2009, 2009 IEEE 8th International Conference on Development and Learning.

[48]  Lee Spector,et al.  Genetic Programming for Reward Function Search , 2010, IEEE Transactions on Autonomous Mental Development.

[49]  Richard L. Lewis,et al.  Intrinsically Motivated Reinforcement Learning: An Evolutionary Perspective , 2010, IEEE Transactions on Autonomous Mental Development.

[50]  Javier R. Movellan,et al.  Infomax Control of Eye Movements , 2010, IEEE Transactions on Autonomous Mental Development.

[51]  Manuel Lopes,et al.  Body schema acquisition through active learning , 2010, 2010 IEEE International Conference on Robotics and Automation.

[52]  Stéphane Doncieux,et al.  Behavioral diversity measures for Evolutionary Robotics , 2010, IEEE Congress on Evolutionary Computation.

[53]  Kathryn E. Merrick,et al.  A Comparative Study of Value Systems for Self-Motivated Exploration and Learning by Robots , 2010, IEEE Transactions on Autonomous Mental Development.

[54]  Andrew G. Barto,et al.  Intrinsically Motivated Hierarchical Skill Learning in Structured Environments , 2010, IEEE Transactions on Autonomous Mental Development.

[55]  Maya Cakmak,et al.  Designing Interactions for Robot Active Learners , 2010, IEEE Transactions on Autonomous Mental Development.

[56]  Maria-Florina Balcan,et al.  The true sample complexity of active learning , 2010, Machine Learning.

[57]  Kenneth O. Stanley,et al.  Abandoning Objectives: Evolution Through the Search for Novelty Alone , 2011, Evolutionary Computation.