Autonomous development of goals: From generic rewards to goal and self detection

Goals are abstractions that express agents' intentions and allow them to organize their behavior appropriately. How can agents develop such goals autonomously? This paper proposes a conceptual and computational account of this longstanding problem. We argue that goals should be considered abstractions of lower-level intention mechanisms such as rewards and values, and point out that goals need to be considered alongside the detection of the effects of the agent's own actions. Under this view, both goals and self-detection can be learned from generic rewards. We show experimentally that task-unspecific rewards induced by visual saliency lead to self and goal representations that constitute goal-directed reaching.
