Intrinsically Motivated Learning in Natural and Artificial Systems

It has become clear to researchers in robotics and adaptive behaviour that current approaches are yielding systems with limited autonomy and capacity for self-improvement. To learn autonomously and in a cumulative fashion is one of the hallmarks of intelligence, and we know that higher mammals engage in exploratory activities that are not directed to pursue goals of immediate relevance for survival and reproduction but are instead driven by intrinsic motivations such as curiosity, interest in novel stimuli or surprising events, and interest in learning new behaviours. The adaptive value of such intrinsically motivated activities lies in the fact that they allow the cumulative acquisition of knowledge and skills that can be used later to accomplish fitness-enhancing goals. Intrinsic motivations continue during adulthood, and in humans they underlie lifelong learning, artistic creativity, and scientific discovery, while they are also the basis for processes that strongly affect human well-being, such as the sense of competence, self-determination, and self-esteem. This book has two aims: to present the state of the art in research on intrinsically motivated learning, and to identify the related scientific and technological open challenges and most promising research directions. The book introduces the concept of intrinsic motivation in artificial systems, reviews the relevant literature, offers insights from the neural and behavioural sciences, and presents novel tools for research.
The book is organized into six parts: the chapters in Part I give general overviews on the concept of intrinsic motivations, their function, and possible mechanisms for implementing them; Parts II, III, and IV focus on three classes of intrinsic motivation mechanisms, those based on predictors, on novelty, and on competence; Part V discusses mechanisms that are complementary to intrinsic motivations; and Part VI introduces tools and experimental frameworks for investigating intrinsic motivations. The contributing authors are among the pioneers carrying out fundamental work on this topic, drawn from related disciplines such as artificial intelligence, robotics, artificial life, evolution, machine learning, developmental psychology, cognitive science, and neuroscience. The book will be of value to graduate students and academic researchers in these domains, and to engineers engaged with the design of autonomous, adaptive robots.


[235]  Jürgen Schmidhuber,et al.  Reinforcement Learning in Markovian and Non-Markovian Environments , 1990, NIPS.

[236]  S. Kapur,et al.  Dopamine, prediction error and associative learning: A model-based account , 2006, Network.

[237]  Peter Stone,et al.  Transfer Learning for Reinforcement Learning Domains: A Survey , 2009, J. Mach. Learn. Res..

[238]  Risto Miikkulainen,et al.  Efficient Non-linear Control Through Neuroevolution , 2006, ECML.

[239]  Emrah Duzel,et al.  A neoHebbian framework for episodic memory; role of dopamine-dependent late LTP , 2011, Trends in Neurosciences.

[240]  W. Schultz Behavioral theories and the neurophysiology of reward. , 2006, Annual review of psychology.

[241]  L. Frank,et al.  New Experiences Enhance Coordinated Neural Activity in the Hippocampus , 2008, Neuron.

[242]  R. Grupen,et al.  Nonholonomic Path Planning Using Harmonic Functions , 1994 .

[243]  Jürgen Schmidhuber,et al.  Artificial Scientists & Artists Based on the Formal Theory of Creativity , 2010, AGI 2010.

[244]  Vincenzo Perciavalle,et al.  The projections of the retrorubral field A8 to the hippocampal formation in the rat , 1996, Experimental Brain Research.

[245]  Jun Tani,et al.  Achieving "organic compositionality" through self-organization: Reviews on brain-inspired robotics experiments , 2008, Neural Networks.

[246]  Karl J. Friston,et al.  A theory of cortical responses , 2005, Philosophical Transactions of the Royal Society B: Biological Sciences.

[247]  Jürgen Schmidhuber,et al.  Low-Complexity Art , 2017 .

[248]  P. Goldman-Rakic,et al.  Dopamine synaptic complex with pyramidal neurons in primate cerebral cortex. , 1989, Proceedings of the National Academy of Sciences of the United States of America.

[249]  J. Colombo,et al.  Infant visual habituation , 2009, Neurobiology of Learning and Memory.

[250]  Mark Steedman,et al.  Object-Action Complexes: Grounded abstractions of sensory-motor processes , 2011, Robotics Auton. Syst..

[251]  Hugo Vieira Neto,et al.  Novelty-based visual inspection using mobile robots , 2004 .

[252]  Sachin C. Patwardhan,et al.  Experiment design, identification and control in large-scale chemical processes , 2010, Proceedings of the 2010 International Conference on Modelling, Identification and Control.

[253]  Victor Raskin,et al.  Semantic mechanisms of humor , 1984 .

[254]  Ken E. Whelan,et al.  The Automation of Science , 2009, Science.

[255]  V. Ferrera,et al.  Modification of Saccades Evoked by Stimulation of Frontal Eye Field during Invisible Target Tracking , 2004, The Journal of Neuroscience.

[256]  Ofi rNw8x'pyzm,et al.  The Speed Prior: A New Simplicity Measure Yielding Near-Optimal Computable Predictions , 2002 .

[257]  N. Bunzeck,et al.  Absolute Coding of Stimulus Novelty in the Human Substantia Nigra/VTA , 2006, Neuron.

[258]  Corso Elvezia What's Interesting? , 1997 .

[259]  M. Deschenes,et al.  Corticostriatal projections from layer V cells in rat are collaterals of long-range corticofugal axons , 1996, Brain Research.

[260]  E. Miller,et al.  An integrative theory of prefrontal cortex function. , 2001, Annual review of neuroscience.

[261]  Richard M. Murray,et al.  A Mathematical Introduction to Robotic Manipulation , 1994 .

[262]  M. Stryker,et al.  The Role of Activity in the Development of Long-Range Horizontal Connections in Area 17 of the Ferret , 1996, The Journal of Neuroscience.

[263]  Philip Ball,et al.  The Self-Made Tapestry: Pattern Formation in Nature , 1999 .

[264]  R. G. M. Helali Data Mining Based Network Intrusion Detection System: A Survey , 2008, TeNe.

[265]  Geoffrey E. Hinton,et al.  Generative models for discovering sparse distributed representations. , 1997, Philosophical transactions of the Royal Society of London. Series B, Biological sciences.

[266]  Jonathan D. Cohen,et al.  Learning to selectively attend , 2010 .

[267]  R. Sutton,et al.  Off-Policy Knowledge Maintenance for Robots , 2010 .

[268]  A. Gasbarri,et al.  Organization of the projections from the ventral tegmental area of Tsai to the hippocampal formation in the rat. , 1991, Journal fur Hirnforschung.

[269]  Scott P. Johnson,et al.  Newborn infant's perception of partly occluded objects , 1996 .

[270]  S.M. Harris,et al.  Information Processing , 1977, Nature.

[271]  Pierre-Yves Oudeyer,et al.  Intrinsically Motivated Machines , 2006, 50 Years of Artificial Intelligence.

[272]  A. Tversky,et al.  Prospect theory: an analysis of decision under risk — Source link , 2007 .

[273]  Stefan Schaal,et al.  Robot learning by nonparametric regression , 1994, Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS'94).

[274]  U. Frey,et al.  Dopaminergic antagonists prevent long-term maintenance of posttetanic LTP in the CA1 region of rat hippocampal slices , 1990, Brain Research.

[275]  Rolf Pfeifer,et al.  How the body shapes the way we think - a new view on intelligence , 2006 .

[276]  Okihide Hikosaka,et al.  A neural correlate of motivational conflict in the superior colliculus of the macaque. , 2008, Journal of neurophysiology.

[277]  J. Lisman,et al.  The Hippocampal-VTA Loop: Controlling the Entry of Information into Long-Term Memory , 2005, Neuron.

[278]  Deepak Kumar,et al.  BRINGING UP ROBOT: FUNDAMENTAL MECHANISMS FOR CREATING A SELF-MOTIVATED, SELF-ORGANIZING ARCHITECTURE , 2005, Cybern. Syst..

[279]  Peter Stone,et al.  Empowerment for continuous agent—environment systems , 2011, Adapt. Behav..

[280]  James L. McGaugh,et al.  Evidence for dopamine as a transmitter in dorsal hippocampus , 1982, Brain Research.

[281]  Andrew G. Barto,et al.  Efficient skill learning using abstraction selection , 2009, IJCAI 2009.

[282]  Giorgio Metta,et al.  A prototype fingertip with high spatial resolution pressure sensing for the robot iCub , 2008, Humanoids 2008 - 8th IEEE-RAS International Conference on Humanoid Robots.

[283]  Giulio Sandini,et al.  Learning about objects through action - initial steps towards artificial cognition , 2003, 2003 IEEE International Conference on Robotics and Automation (Cat. No.03CH37422).

[284]  Nirvana Meratnia,et al.  Outlier Detection Techniques for Wireless Sensor Networks: A Survey , 2008, IEEE Communications Surveys & Tutorials.

[285]  W. K. Cullen,et al.  Dopamine-dependent facilitation of LTP induction in hippocampal CA1 by exposure to spatial novelty , 2003, Nature Neuroscience.

[286]  Benjamin Kuipers,et al.  Bootstrap learning of foundational representations , 2006, Connect. Sci..

[287]  T. Dennis,et al.  The formation of deaminated metabolites of dopamine in the locus coeruleus depends upon noradrenergic neuronal activity , 1985, Brain Research.

[288]  R. Paget The Origin of Speech , 1927, Nature.

[289]  Manuel Lopes,et al.  Active Learning for Reward Estimation in Inverse Reinforcement Learning , 2009, ECML/PKDD.

[290]  K. Berridge The debate over dopamine’s role in reward: the case for incentive salience , 2007, Psychopharmacology.

[291]  O. Johansson,et al.  Dopamine Nerve Terminals in the Rat Limbic Cortex: Aspects of the Dopamine Hypothesis of Schizophrenia , 1974, Science.

[292]  J. Konczak,et al.  The development of goal-directed reaching in infants II. Learning to produce task-adequate patterns of joint torque , 1997, Experimental Brain Research.

[293]  W. Schultz,et al.  Influences of Rewarding and Aversive Outcomes on Activity in Macaque Lateral Prefrontal Cortex , 2006, Neuron.

[294]  Jürgen Schmidhuber,et al.  Curious model-building control systems , 1991, [Proceedings] 1991 IEEE International Joint Conference on Neural Networks.

[295]  Kathryn E. Merrick Modeling Behavior Cycles as a Value System for Developmental Robots , 2010, Adapt. Behav..

[296]  Risto Miikkulainen,et al.  Developing navigation behavior through self-organizing distinctive-state abstraction , 2006, Connect. Sci..

[297]  Michael Kearns,et al.  Near-Optimal Reinforcement Learning in Polynomial Time , 2002, Machine Learning.

[298]  J. Deakin,et al.  5-HT and mechanisms of defence , 1991, Journal of psychopharmacology.

[299]  Andrew G. Barto,et al.  Building Portable Options: Skill Transfer in Reinforcement Learning , 2007, IJCAI.

[300]  P. Read Montague,et al.  When Things Are Better or Worse than Expected: The Medial Frontal Cortex and the Allocation of Processing Resources , 2006, Journal of Cognitive Neuroscience.

[301]  E. Rolls The orbitofrontal cortex and reward. , 2000, Cerebral cortex.

[302]  Heekuck Oh,et al.  Neural Networks for Pattern Recognition , 1993, Adv. Comput..

[303]  Peter Redgrave,et al.  Layered Control Architectures in Robots and Vertebrates , 1999, Adapt. Behav..

[304]  三嶋 博之 The theory of affordances , 2008 .

[305]  Nikolaos G. Tsagarakis,et al.  Lower body realization of the baby humanoid - ‘iCub’ , 2007, 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[306]  David R. Thompson,et al.  Domain-Guided Novelty Detection for Autonomous Exploration , 2009, IJCAI.

[307]  Peter H. Glow,et al.  Response-contingent sensory change in a causally structured environment , 1978 .

[308]  S. Thorpe,et al.  Seeking Categories in the Brain , 2001, Science.

[309]  D. Bjorklund The role of immaturity in human development. , 1997, Psychological bulletin.

[310]  M. Botvinick,et al.  Hierarchically organized behavior and its neural foundations: A reinforcement learning perspective , 2009, Cognition.

[311]  P. Goldman-Rakic,et al.  The role of D1-dopamine receptor in working memory: local injections of dopamine antagonists into the prefrontal cortex of rhesus monkeys performing an oculomotor delayed-response task. , 1994, Journal of neurophysiology.

[312]  M. T. Shipley,et al.  Columnar organization in the midbrain periaqueductal gray: modules for emotional expression? , 1994, Trends in Neurosciences.

[313]  Jonathan R. Whitlock,et al.  Learning Induces Long-Term Potentiation in the Hippocampus , 2006, Science.

[314]  John Langford,et al.  Agnostic active learning , 2006, J. Comput. Syst. Sci..

[315]  Tatsuo K Sato,et al.  Correlated Coding of Motivation and Outcome of Decision by Dopamine Neurons , 2003, The Journal of Neuroscience.

[316]  W. Schultz,et al.  Responses of monkey dopamine neurons during learning of behavioral reactions. , 1992, Journal of neurophysiology.

[317]  M. Bornstein,et al.  Continuity in mental development from infancy. , 1986, Child development.

[318]  W. Hershberger An approach through the looking-glass , 1986 .

[319]  Mark H. Lee,et al.  Staged Competence Learning in Developmental Robotics , 2007, Adapt. Behav..

[320]  Karl J. Friston,et al.  Action and behavior: a free-energy formulation , 2010, Biological Cybernetics.

[321]  E. Deci,et al.  Extrinsic Rewards and Intrinsic Motivation in Education: Reconsidered Once Again , 2001 .

[322]  Jun Tani,et al.  Emergence of Functional Hierarchy in a Multiple Timescale Neural Network Model: A Humanoid Robot Experiment , 2008, PLoS Comput. Biol..

[323]  Bernd Fritzke,et al.  A Growing Neural Gas Network Learns Topologies , 1994, NIPS.

[324]  David A. Huffman,et al.  A method for the construction of minimum-redundancy codes , 1952, Proceedings of the IRE.

[325]  Hervé Simon,et al.  Efferents and afferents of the ventral tegmental-A10 region studied after local injection of [3H]leucine and horseradish peroxidase , 1979, Brain Research.

[326]  Angelo Cangelosi,et al.  Aquila: An open-source GPU-accelerated toolkit for cognitive and neuro-robotics research , 2011, The 2011 International Joint Conference on Neural Networks.

[327]  O. Hikosaka,et al.  Functional properties of monkey caudate neurons. III. Activities related to expectation of target and reward. , 1989, Journal of neurophysiology.

[328]  Claes von Hofsten,et al.  Action in development. , 2007, Developmental science.

[329]  Richard S. Sutton,et al.  Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.

[330]  Marco Mirolli,et al.  Evolution and Learning in an Intrinsically Motivated Reinforcement Learning Robot , 2007, ECAL.

[331]  John N. Tsitsiklis,et al.  The Complexity of Markov Decision Processes , 1987, Math. Oper. Res..

[332]  S. Schaal,et al.  Robot juggling: implementation of memory-based learning , 1994, IEEE Control Systems.

[333]  M. Geffard,et al.  Ultrastructural immunocytochemical study of the dopaminergic innervation of the rat lateral septum with anti-dopamine antibodies , 1984, Neuroscience.

[334]  F. Clarac,et al.  Localization and organization of the central pattern generator for hindlimb locomotion in newborn rat , 1995, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[335]  Rick O. Gilmore,et al.  Examining individual differences in infants’ habituation patterns using objective quantitative techniques , 2002 .

[336]  P. Dayan,et al.  A Bayesian formulation of behavioral control , 2009, Cognition.

[337]  Pierre-Yves Oudeyer,et al.  The progress drive hypothesis: an interpretation of early imitation , 2007 .

[338]  J. Gray The neuropsychology of anxiety. , 1985, Issues in mental health nursing.

[339]  David J. C. MacKay,et al.  Information-Based Objective Functions for Active Data Selection , 1992, Neural Computation.

[340]  Robert D. Nowak,et al.  Minimax Bounds for Active Learning , 2007, IEEE Transactions on Information Theory.

[341]  A. Laverghetta,et al.  Differential morphology of pyramidal tract‐type and intratelencephalically projecting‐type corticostriatal neurons and their intrastriatal terminals in rats , 2003, The Journal of comparative neurology.

[342]  R. Grantyn,et al.  Gaze control through superior colliculus: structure and function. , 1988, Reviews of oculomotor research.

[343]  Andrew G. Barto,et al.  Competence progress intrinsic motivation , 2010, 2010 IEEE 9th International Conference on Development and Learning.

[344]  Jürgen Schmidhuber,et al.  Completely Self-referential Optimal Reinforcement Learners , 2005, ICANN.

[345]  Chrystopher L. Nehaniv,et al.  From unknown sensors and actuators to actions grounded in sensorimotor perceptions , 2006, Connect. Sci..

[346]  L. Steels Self-organising vocabularies , 1996 .

[347]  Marcus Hutter,et al.  Universal Artificial Intellegence - Sequential Decisions Based on Algorithmic Probability , 2005, Texts in Theoretical Computer Science. An EATCS Series.

[348]  Pierre-Yves Oudeyer,et al.  In Search of the Neural Circuits of Intrinsic Motivation , 2007, Front. Neurosci..

[349]  Giulio Sandini,et al.  The iCub humanoid robot: an open platform for research in embodied cognition , 2008, PerMIS.

[350]  Malcolm J. A. Strens,et al.  A Bayesian Framework for Reinforcement Learning , 2000, ICML.

[351]  Giorgio Metta,et al.  Towards long-lived robot genes , 2008, Robotics Auton. Syst..

[352]  Tom Schaul,et al.  Curiosity-driven optimization , 2011, 2011 IEEE Congress of Evolutionary Computation (CEC).

[353]  James L. McClelland,et al.  Autonomous Mental Development by Robots and Animals , 2001, Science.

[354]  Natasha Z. Kirkham,et al.  Infant Cortical Development and the Prospective Control of Saccadic Eye Movements , 2001 .

[355]  Jürgen Schmidhuber,et al.  Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990–2010) , 2010, IEEE Transactions on Autonomous Mental Development.

[356]  D. Blanchard,et al.  Ethoexperimental approaches to the biology of emotion. , 1988, Annual review of psychology.

[357]  Jürgen Schmidhuber,et al.  Driven by Compression Progress: A Simple Principle Explains Essential Aspects of Subjective Beauty, Novelty, Surprise, Interestingness, Attention, Curiosity, Creativity, Art, Science, Music, Jokes , 2008, ABiALS.

[358]  B. Balleine,et al.  The role of prelimbic cortex in instrumental conditioning , 2003, Behavioural Brain Research.

[359]  J. Mayhew,et al.  How Visual Stimuli Activate Dopaminergic Neurons at Short Latency , 2005, Science.

[360]  Reda Alhajj,et al.  A comprehensive survey of numeric and symbolic outlier mining techniques , 2006, Intell. Data Anal..

[361]  K. Berridge Motivation concepts in behavioral neuroscience , 2004, Physiology & Behavior.

[362]  D. Thistlethwaite A critical review of latent learning and related experiments. , 1951, Psychological bulletin.

[363]  Peter Dayan,et al.  A Neural Substrate of Prediction and Reward , 1997, Science.

[364]  David H. Ackley,et al.  Adaptation in Constant Utility Non-Stationary Environments , 1991, ICGA.

[365]  Stefano Nolfi,et al.  Learning to perceive the world as articulated: an approach for hierarchical learning in sensory-motor systems , 1998, Neural Networks.

[366]  Pierre-Yves Oudeyer,et al.  The interaction of maturational constraints and intrinsic motivations in active motor development , 2011, 2011 IEEE International Conference on Development and Learning (ICDL).

[367]  Petra Stoerig,et al.  Blindsight, conscious vision, and the role of primary visual cortex. , 2006, Progress in brain research.

[368]  Ethan S. Bromberg-Martin,et al.  Multiple Timescales of Memory in Lateral Habenula and Dopamine Neurons , 2010, Neuron.

[369]  C. S. Wallace,et al.  Estimation and Inference by Compact Coding , 1987 .

[370]  Kathryn E. Merrick Modelling Affordances for the Control and Evaluation of Intrinsically Motivated Robots , 2009 .

[371]  T. Robbins,et al.  Neural systems of reinforcement for drug addiction: from actions to habits to compulsion , 2005, Nature Neuroscience.

[372]  Francesco Mannella,et al.  Intrinsically motivated action-outcome learning and goal-based action recall: a system-level bio-constrained computational model. , 2013, Neural networks : the official journal of the International Neural Network Society.

[373]  Marco Mirolli,et al.  Phasic dopamine as a prediction error of intrinsic and extrinsic reinforcements driving both action acquisition and reward maximization: A simulated robotic study , 2013, Neural Networks.

[374]  David Andre,et al.  Model based Bayesian Exploration , 1999, UAI.

[375]  T. Martin McGinnity,et al.  Novelty Detection as an Intrinsic Motivation for Cumulative Learning Robots , 2013, Intrinsically Motivated Learning in Natural and Artificial Systems.

[376]  J. Piaget The child's construction of reality , 1954 .

[377]  Luigi F. Agnati,et al.  The emergence of the volume transmission concept 1 Published on the World Wide Web on 12 January 1998. 1 , 1998, Brain Research Reviews.

[378]  A. Philip McMahon,et al.  The Principles of Art , 1939 .

[379]  D. Koditschek,et al.  Robot navigation functions on manifolds with boundary , 1990 .

[380]  K. Fischer,et al.  Stages and Individual Differences in Cognitive Development , 1985 .

[381]  Thomas J. Walsh,et al.  Knows what it knows: a framework for self-aware learning , 2008, ICML '08.

[382]  Sidney S. Simon,et al.  Merging of the Senses , 2008, Front. Neurosci..

[383]  Y. Agid,et al.  Reduction of cortical dopamine, noradrenaline, serotonin and their metabolites in Parkinson's disease , 1983, Brain Research.

[384]  Mark B. Ring Toward a Formal Framework for Continual Learning , 2005 .

[385]  M. Jüptner,et al.  A review of differences between basal ganglia and cerebellar control of movements as revealed by functional imaging studies. , 1998, Brain : a journal of neurology.

[386]  K. Montgomery The role of the exploratory drive in learning. , 1954, Journal of comparative and physiological psychology.

[387]  Ursula Klingmüller,et al.  Simulation Methods for Optimal Experimental Design in Systems Biology , 2003, Simul..

[388]  Peter Dayan,et al.  Exploration from Generalization Mediated by Multiple Controllers , 2013, Intrinsically Motivated Learning in Natural and Artificial Systems.

[389]  J. Lockman A perception--action perspective on tool use development. , 2000, Child development.

[390]  Yoshihiko Nakamura,et al.  Advanced robotics - redundancy and optimization , 1990 .

[391]  M. Tsodyks,et al.  Synaptic Theory of Working Memory , 2008, Science.

[392]  Stephen R. Marsland,et al.  A Real-Time Novelty Detector for a Mobile Robot , 2000, ArXiv.

[393]  H. Bergman,et al.  Information processing, dimensionality reduction and reinforcement learning in the basal ganglia , 2003, Progress in Neurobiology.

[394]  C. H. Honzik,et al.  Degrees of hunger, reward and non-reward, and maze learning in rats, and Introduction and removal of reward, and maze performance in rats , 1930 .

[395]  Emre Ugur,et al.  Learning Object Affordances for Planning , 2009 .

[396]  K. Doya,et al.  The computational neurobiology of learning and reward , 2006, Current Opinion in Neurobiology.

[397]  R. Oades,et al.  Ventral tegmental (A10) system: neurobiology. 1. Anatomy and connectivity , 1987, Brain Research Reviews.

[398]  Mark H. Johnson Functional brain development in humans , 2001, Nature Reviews Neuroscience.

[399]  R. Dolan,et al.  Reward Facilitates Tactile Judgments and Modulates Hemodynamic Responses in Human Primary Somatosensory Cortex , 2008, The Journal of Neuroscience.

[400]  Jung-Min Park,et al.  An overview of anomaly detection techniques: Existing solutions and latest technological trends , 2007, Comput. Networks.

[401]  Thomas R. Gruber,et al.  A translation approach to portable ontology specifications , 1993 .

[402]  P. Dayan,et al.  Cortical substrates for exploratory decisions in humans , 2006, Nature.

[403]  L. Swanson,et al.  The projections of the ventral tegmental area and adjacent regions: A combined fluorescent retrograde tracer and immunofluorescence study in the rat , 1982, Brain Research Bulletin.

[404]  R. Wightman,et al.  Dopamine Operates as a Subsecond Modulator of Food Seeking , 2004, The Journal of Neuroscience.

[405]  G. Baldassarre,et al.  Evolving internal reinforcers for an intrinsically motivated reinforcement-learning robot , 2007, 2007 IEEE 6th International Conference on Development and Learning.

[406]  B. K. Hartman,et al.  The central adrenergic system. An immunofluorescence study of the location of cell bodies and their efferent connections in the rat utilizing dopamine‐B‐hydroxylase as a marker , 1975, The Journal of comparative neurology.

[407]  J. Bolam,et al.  Novel and Distinct Operational Principles of Intralaminar Thalamic Neurons and Their Striatal Projections , 2007, The Journal of Neuroscience.

[408]  Stephen Hart,et al.  Intrinsically motivated hierarchical manipulation , 2008, 2008 IEEE International Conference on Robotics and Automation.

[409]  D. Maurer,et al.  Developmental changes in the scanning of faces by young infants. , 1976, Child development.

[410]  J. Mink THE BASAL GANGLIA: FOCUSED SELECTION AND INHIBITION OF COMPETING MOTOR PROGRAMS , 1996, Progress in Neurobiology.

[411]  P. Dean,et al.  Event or emergency? Two response systems in the mammalian superior colliculus , 1989, Trends in Neurosciences.

[412]  T. Dennis,et al.  Increase in dopamine and DOPAC levels in noradrenergic terminals after electrical stimulation of the ascending noradrenergic pathways , 1984, Brain Research.

[413]  Kathryn E. Merrick Designing Toys That Come Alive: Curious Robots for Creative Play , 2008, ICEC.

[414]  Lena H Ting,et al.  Neuromechanics of muscle synergies for posture and movement , 2007, Current Opinion in Neurobiology.

[415]  R. Wise Dopamine, learning and motivation , 2004, Nature Reviews Neuroscience.

[416]  J. Kalaska,et al.  Neural mechanisms for interacting with a world full of action choices. , 2010, Annual review of neuroscience.

[417]  P. Redgrave,et al.  Nociceptive responses of midbrain dopaminergic neurones are modulated by the superior colliculus in the rat , 2006, Neuroscience.

[418]  Stewart W. Wilson ZCS: A Zeroth Level Classifier System , 1994, Evolutionary Computation.

[419]  Brian Knutson,et al.  Reward-Motivated Learning: Mesolimbic Activation Precedes Memory Formation , 2006, Neuron.

[420]  G. Gessa,et al.  Alpha2‐adrenoceptor mediated co‐release of dopamine and noradrenaline from noradrenergic neurons in the cerebral cortex , 2004, Journal of neurochemistry.

[421]  R. Morris,et al.  Making memories last: the synaptic tagging and capture hypothesis , 2010, Nature Reviews Neuroscience.

[422]  P. Greengard,et al.  Beyond the Dopamine Receptor: Review the DARPP-32/Protein Phosphatase-1 Cascade , 1999 .

[423]  Jürgen Schmidhuber,et al.  Exploring the predictable , 2003 .

[424]  F. Goodkin Rats learn the relationship between responding and environmental events: An expansion of the learned helplessness hypothesis , 1976 .

[425]  F. Guarraci,et al.  An electrophysiological characterization of ventral tegmental area dopaminergic neurons during differential pavlovian fear conditioning in the awake rabbit , 1999, Behavioural Brain Research.

[426]  S. Hochreiter,et al.  REINFORCEMENT DRIVEN INFORMATION ACQUISITION IN NONDETERMINISTIC ENVIRONMENTS , 1995 .

[427]  Stephen R. Marsland,et al.  A tale of two filters-on-line novelty detection , 2002, Proceedings 2002 IEEE International Conference on Robotics and Automation (Cat. No.02CH37292).

[428]  Jürgen Schmidhuber,et al.  HQ-Learning , 1997, Adapt. Behav..

[429]  M. Geffard,et al.  Immunocytochemical localization of dopamine in the prefrontal cortex of the rat at the light and electron microscopical level , 1987, Neuroscience.

[430]  J. Deniau,et al.  Disinhibition as a basic process in the expression of striatal functions , 1990, Trends in Neurosciences.

[431]  J. Kagan,et al.  The growth of memory during infancy. , 1979, Genetic psychology monographs.

[432]  M. Geffard,et al.  Antisera against catecholamines: specificity studies and physicochemical data for anti-dopamine and anti-p-tyramine antibodies. , 1984, Molecular immunology.

[433]  C. L. Hull Principles of behavior : an introduction to behavior theory , 1943 .

[434]  S. Marsland Novelty Detection in Learning Systems , 2008 .

[435]  Marco Mirolli,et al.  Deciding Which Skill to Learn When: Temporal-Difference Competence-Based Intrinsic Motivation (TD-CB-IM) , 2013, Intrinsically Motivated Learning in Natural and Artificial Systems.

[436]  E. Izhikevich Solving the distal reward problem through linkage of STDP and dopamine signaling , 2007, BMC Neuroscience.

[437]  S. Grillner,et al.  Mechanisms for selection of basic motor programs – roles for the striatum and pallidum , 2005, Trends in Neurosciences.

[438]  M. Haith,et al.  Infants' acquisition of spatiotemporal expectations. , 1998, Developmental psychology.

[439]  B. Berger,et al.  Morphological evidence for a dopaminergic terminal field in the hippocampal formation of young and adult rat , 1985, Neuroscience.

[440]  Philip L. Smith,et al.  Attention orienting and the time course of perceptual decisions: response time distributions with masked and unmasked displays , 2004, Vision Research.

[441]  Scott P. Johnson,et al.  Where Infants Look Determines How They See: Eye Movements and Object Perception Performance in 3-Month-Olds. , 2004, Infancy : the official journal of the International Society on Infant Studies.

[442]  Jürgen Schmidhuber,et al.  Shifting Inductive Bias with Success-Story Algorithm, Adaptive Levin Search, and Incremental Self-Improvement , 1997, Machine Learning.

[443]  Tai Sing Lee,et al.  Contextual Influences in Visual Processing , 2008 .

[444]  N. Krüger,et al.  Autonomous Learning of Object-specific Grasp Affordance Densities , 2009 .

[445]  Raymond J. Dolan,et al.  Anticipation of novelty recruits reward system and hippocampus while promoting recollection , 2007, NeuroImage.

[446]  H. Harlow Learning and satiation of response in intrinsically motivated complex puzzle performance by monkeys. , 1950, Journal of comparative and physiological psychology.

[447]  B. Skinner Contingencies of reinforcement : a theoretical analysis , 1969 .

[448]  Hugo Vieira Neto,et al.  Visual attention and novelty detection: experiments with automatic scale selection , 2006 .

[449]  D. Blei,et al.  Context, learning, and extinction. , 2010, Psychological review.

[450]  Bernd Fritzke,et al.  Growing cell structures--A self-organizing network for unsupervised and supervised learning , 1994, Neural Networks.

[451]  Y. Agid,et al.  Dopamine deficiency in the cerebral cortex in Parkinson disease , 1982, Neurology.

[452]  Hugo Vieira Neto,et al.  Visual novelty detection with automatic scale selection , 2007, Robotics Auton. Syst..

[453]  W. Schultz Dopamine signals for reward value and risk: basic and recent data , 2010, Behavioral and Brain Functions.

[454]  Brian Scassellati,et al.  Learning acceptable windows of contingency , 2006, Connect. Sci..

[455]  Andrew G. Barto,et al.  PolicyBlocks: An Algorithm for Creating Useful Macro-Actions in Reinforcement Learning , 2002, ICML.

[456]  M. Schlesinger Heterochrony: It's (all) about time! , 2008 .

[457]  David G. Lowe,et al.  Fast Approximate Nearest Neighbors with Automatic Algorithm Configuration , 2009, VISAPP.

[458]  Konrad Paul Kording,et al.  Review TRENDS in Cognitive Sciences Vol.10 No.7 July 2006 Special Issue: Probabilistic models of cognition Bayesian decision theory in sensorimotor control , 2022 .

[459]  Mark B. Ring Continual learning in reinforcement environments , 1995, GMD-Bericht.

[460]  P. Glimcher,et al.  Midbrain Dopamine Neurons Encode a Quantitative Reward Prediction Error Signal , 2005, Neuron.

[461]  M Schleidt,et al.  Segmentation in behavior and what it can tell us about brain function , 1997, Human nature.

[462]  Marcus Hutter,et al.  Strong Asymptotic Assertions for Discrete MDL in Regression and Classification , 2005, ArXiv.

[463]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[464]  Wynne A. Lee,et al.  Neuromotor synergies as a basis for coordinated intentional action. , 1984, Journal of motor behavior.

[465]  Ray J. Solomonoff,et al.  A Formal Theory of Inductive Inference. Part I , 1964, Inf. Control..

[466]  Jürgen Schmidhuber,et al.  Planning simple trajectories using neural subgoal generators , 1993 .

[467]  Rich Caruana,et al.  Multitask Learning , 1998, Encyclopedia of Machine Learning and Data Mining.

[468]  Matthew Schlesinger,et al.  Investigating the Origins of Intrinsic Motivation in Human Infants , 2013, Intrinsically Motivated Learning in Natural and Artificial Systems.

[469]  O. Hikosaka,et al.  Influence of reward expectation on visuospatial processing in macaque lateral prefrontal cortex. , 2002, Journal of neurophysiology.

[470]  Leslie Pack Kaelbling,et al.  Planning and Acting in Partially Observable Stochastic Domains , 1998, Artif. Intell..

[471]  G. Turkewitz,et al.  The Role of Developmental Limitations of Sensory Input on Sensory/Perceptual Organization , 1985, Journal of developmental and behavioral pediatrics : JDBP.

[472]  P. Dayan,et al.  Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control , 2005, Nature Neuroscience.

[473]  Jan Peters,et al.  Model Learning in Robotics: a Survey , 2011 .

[474]  C. Koch,et al.  Computational modelling of visual attention , 2001, Nature Reviews Neuroscience.

[475]  Karl J. Friston,et al.  Temporal Difference Models and Reward-Related Learning in the Human Brain , 2003, Neuron.

[476]  Bram Bakker,et al.  Hierarchical Reinforcement Learning Based on Subgoal Discovery and Subpolicy Specialization , 2003 .

[477]  O. Hikosaka,et al.  Dopamine Neurons Can Represent Context-Dependent Prediction Error , 2004, Neuron.

[478]  L. Descarries,et al.  Distribution and Morphological Characteristics of Dopamine‐Immunoreactive Neurons in the Midbrain of the Squirrel Monkey (Saimiri sciureus) , 1988, The Journal of comparative neurology.

[479]  Mark G. Packard,et al.  Anterograde and retrograde tracing of projections from the ventral tegmental area to the hippocampal formation in the rat , 1994, Brain Research Bulletin.

[480]  Larry Stein,et al.  Reinforcement delay of one second severely impairs acquisition of brain self-stimulation , 1985, Brain Research.

[481]  David S. Touretzky,et al.  Similarity and Discrimination in Classical Conditioning: A Latent Variable Account , 2004, NIPS.

[482]  Radford M. Neal Markov Chain Sampling Methods for Dirichlet Process Mixture Models , 2000 .

[483]  Jürgen Schmidhuber,et al.  A Formal Theory of Creativity to Model the Creation of Art , 2012 .

[484]  J. Bolam,et al.  Activity of Neurochemically Heterogeneous Dopaminergic Neurons in the Substantia Nigra during Spontaneous and Driven Changes in Brain State , 2009, The Journal of Neuroscience.

[485]  H. Fibiger,et al.  Cortical Regulation of Subcortical Dopamine Release: Mediation via the Ventral Tegmental Area , 1995, Journal of neurochemistry.

[486]  D. Sofge THE ROLE OF EXPLORATION IN LEARNING CONTROL , 1992 .

[487]  E. Bushnell,et al.  Motor development and the mind: the potential role of motor abilities as a determinant of aspects of perceptual development. , 1993, Child development.

[488]  Hugo Vieira Neto,et al.  Visual Novelty Detection for Inspection Tasks using Mobile Robots , 2004 .

[489]  J. C. Stanley Computer simulation of a model of habituation , 1976, Nature.

[490]  VARUN CHANDOLA,et al.  Anomaly detection: A survey , 2009, CSUR.

[491]  Nuttapong Chentanez,et al.  Intrinsically Motivated Reinforcement Learning , 2004, NIPS.

[492]  Stephen Hart,et al.  An intrinsic reward for affordance exploration , 2009, 2009 IEEE 8th International Conference on Development and Learning.

[493]  R. W. White Motivation reconsidered: the concept of competence. , 1959, Psychological review.

[494]  David A. Cohn,et al.  Active Learning with Statistical Models , 1996, NIPS.

[495]  S. Maier,et al.  Stressor controllability and learned helplessness: The roles of the dorsal raphe nucleus, serotonin, and corticotropin-releasing factor , 2005, Neuroscience & Biobehavioral Reviews.

[496]  Ingo Rechenberg,et al.  Evolutionsstrategie : Optimierung technischer Systeme nach Prinzipien der biologischen Evolution , 1973 .

[497]  Yael Niv,et al.  OPERANT CONDITIONING , 1974, Scholarpedia.

[498]  Sham M. Kakade,et al.  Opponent interactions between serotonin and dopamine , 2002, Neural Networks.

[499]  Peter Auer,et al.  The Nonstochastic Multiarmed Bandit Problem , 2002, SIAM J. Comput..

[500]  Jutta Heckhausen,et al.  Motivation and action , 1991 .

[501]  Garrison W. Cottrell,et al.  Learning Mackey-Glass from 25 Examples, Plus or Minus 2 , 1993, NIPS.

[502]  Bernd Fritzke Incremental Learning of Local Linear Mappings , 1995 .

[503]  Yishay Mansour,et al.  Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.

[504]  F. E. Grubbs Procedures for Detecting Outlying Observations in Samples , 1969 .

[505]  M. VanElzakker,et al.  Environmental novelty is associated with a selective increase in Fos expression in the output elements of the hippocampal formation and the perirhinal cortex. , 2008, Learning & memory.

[506]  P. Holland Amount of training affects associatively-activated event representation , 1998, Neuropharmacology.

[507]  Tomaso Poggio,et al.  From Understanding Computation to Understanding Neural Circuitry , 1976 .

[508]  Thomas E. Hazy,et al.  Towards an executive without a homunculus: computational models of the prefrontal cortex/basal ganglia system , 2007, Philosophical Transactions of the Royal Society B: Biological Sciences.

[509]  Jürgen Schmidhuber,et al.  Gödel Machines: Fully Self-referential Optimal Universal Self-improvers , 2007, Artificial General Intelligence.

[510]  Scott P. Johnson,et al.  The Neural Basis for Visual Selective Attention in Young Infants: A Computational Account , 2007, Adapt. Behav..

[511]  O Hikosaka,et al.  GABAergic output of the basal ganglia. , 2007, Progress in brain research.

[512]  M. Banks,et al.  Optical and photoreceptor immaturities limit the spatial and chromatic vision of human neonates. , 1988, Journal of the Optical Society of America. A, Optics and image science.

[513]  Hans-Jochen Heinze,et al.  Novel Scenes Improve Recollection and Recall of Words , 2008, Journal of Cognitive Neuroscience.

[514]  Pierre-Yves Oudeyer,et al.  Intelligent Adaptive Curiosity: a source of Self-Development , 2004 .

[515]  Peter Redgrave,et al.  A direct projection from superior colliculus to substantia nigra for detecting salient visual events , 2003, Nature Neuroscience.

[516]  Jürgen Schmidhuber,et al.  Maximizing Fun by Creating Data with Easily Reducible Subjective Complexity , 2013, Intrinsically Motivated Learning in Natural and Artificial Systems.

[517]  T. Kuhn,et al.  The Structure of Scientific Revolutions. , 1964 .

[518]  J. Wickens,et al.  Space, time and dopamine , 2007, Trends in Neurosciences.

[519]  Herbert A. Simon,et al.  The Sciences of the Artificial , 1970 .

[520]  E. Düzel,et al.  Personality Traits Are Differentially Associated with Patterns of Reward and Novelty Processing in the Human Substantia Nigra/Ventral Tegmental Area , 2009, Biological Psychiatry.

[521]  Peter Dayan,et al.  Dopamine: generalization and bonuses , 2002, Neural Networks.

[522]  Ludovic Righetti,et al.  Toward simple control for complex, autonomous robotic applications: combining discrete and rhythmic motor primitives , 2011, Auton. Robots.

[523]  Randy D. Blakely,et al.  Expression cloning of a cocaine-and antidepressant-sensitive human noradrenaline transporter , 1991, Nature.

[524]  H. Heinze,et al.  Reward-Related fMRI Activation of Dopaminergic Midbrain Is Associated with Enhanced Hippocampus- Dependent Long-Term Memory Formation , 2005, Neuron.

[525]  K. Breland,et al.  The misbehavior of organisms. , 1961 .

[526]  N. Berthier,et al.  Proximodistal structure of early reaching in human infants , 1999, Experimental Brain Research.

[527]  J. O’Neill,et al.  Place-selective firing contributes to the reverse-order reactivation of CA1 pyramidal cells during sharp waves in open-field exploration , 2007, The European journal of neuroscience.

[528]  A. Stoytchev Toward Learning the Binding Affordances of Objects : A Behavior-Grounded Approach , 2022 .

[529]  S. Sesack,et al.  Glutamate synaptic inputs to ventral tegmental area neurons in the rat derive primarily from subcortical sources , 2007, Neuroscience.

[530]  Kathryn E. Merrick,et al.  Motivated Reinforcement Learning - Curious Characters for Multiuser Games , 2009 .

[531]  Vincent P Ferrera,et al.  Internally Generated Error Signals in Monkey Frontal Eye Field during an Inferred Motion Task , 2010, The Journal of Neuroscience.

[532]  David A. Cohn,et al.  Improving generalization with active learning , 1994, Machine Learning.

[533]  Giorgio Metta,et al.  Incremental learning of robot dynamics using random features , 2011, 2011 IEEE International Conference on Robotics and Automation.

[534]  Victoria J. Hodge,et al.  A Survey of Outlier Detection Methodologies , 2004, Artificial Intelligence Review.

[535]  Pierre-Yves Oudeyer,et al.  The Discovery of Communication , 2006 .

[536]  Carl E. Rasmussen,et al.  Factorial Hidden Markov Models , 1997 .

[537]  Marshall M. Haith,et al.  The formation of expectations in early infancy , 1993 .

[538]  J. Horvitz Mesolimbocortical and nigrostriatal dopamine responses to salient non-reward events , 2000, Neuroscience.

[539]  Jochen J. Steil,et al.  Goal Babbling Permits Direct Learning of Inverse Kinematics , 2010, IEEE Transactions on Autonomous Mental Development.

[540]  John S. Gero,et al.  Curious agents and situated design evaluations , 2004, Artificial Intelligence for Engineering Design, Analysis and Manufacturing.

[541]  A. Dickinson,et al.  Neuronal coding of prediction errors. , 2000, Annual review of neuroscience.

[542]  Giorgio Metta,et al.  Methods and Technologies for the Implementation of Large-Scale Robot Tactile Sensors , 2011, IEEE Transactions on Robotics.

[543]  Hugo Vieira Neto,et al.  Automated Exploration and Inspection: Comparing Two Visual Novelty Detectors , 2005 .

[544]  Pierre-Yves Oudeyer,et al.  Intrinsically Motivated Learning of Real-World Sensorimotor Skills with Developmental Constraints , 2013, Intrinsically Motivated Learning in Natural and Artificial Systems.

[545]  U. Frey,et al.  Synaptic tagging: implications for late maintenance of hippocampal long-term potentiation , 1998, Trends in Neurosciences.

[546]  J L Dannemiller Competition in early exogenous orienting between 7 and 21 weeks. , 2000, Journal of experimental child psychology.

[547]  Joel L. Davis,et al.  A Model of How the Basal Ganglia Generate and Use Neural Signals That Predict Reinforcement , 1994 .

[548]  Jenq-Neng Hwang,et al.  Query-based learning applied to partially trained multilayer perceptrons , 1991, IEEE Trans. Neural Networks.

[549]  András Lörincz,et al.  The many faces of optimism: a unifying approach , 2008, ICML '08.

[550]  Pierre-Yves Oudeyer,et al.  Exploring robust, intuitive and emergent physical human-robot interaction with the humanoid robot Acroban , 2011, 2011 11th IEEE-RAS International Conference on Humanoid Robots.

[551]  P. Soubrié Reconciling the role of central serotonin neurons in human and animal behavior , 1986, Behavioral and Brain Sciences.

[552]  Mitsuo Kawato,et al.  Inter-module credit assignment in modular reinforcement learning , 2003, Neural Networks.

[553]  Scott P. Johnson,et al.  Simulating Infants' Gaze Patterns during the Development of Perceptual Completion , 2007 .

[554]  J. Bolam,et al.  Uniform Inhibition of Dopamine Neurons in the Ventral Tegmental Area by Aversive Stimuli , 2004, Science.

[555]  Pierre-Yves Oudeyer,et al.  Maturationally-constrained competence-based intrinsically motivated learning , 2010, 2010 IEEE 9th International Conference on Development and Learning.

[556]  Tom Schaul,et al.  Coherence Progress: A Measure of Interestingness Based on Fixed Compressors , 2011, AGI.

[557]  C. Von Hofsten An action perspective on motor development. , 2004, Trends in cognitive sciences.

[558]  Giovanni Pezzulo,et al.  Learning to Look in Different Environments: An Active-Vision Model Which Learns and Readapts Visual Routines , 2010, SAB.

[559]  Tom Schaul,et al.  Fitness Expectation Maximization , 2008, PPSN.

[560]  R. Wightman,et al.  Real-time measurement of dopamine fluctuations after cocaine in the brain of behaving rats. , 2005, Proceedings of the National Academy of Sciences of the United States of America.

[561]  R. Dolan,et al.  Contextual Novelty Changes Reward Representations in the Striatum , 2010, The Journal of Neuroscience.

[562]  Sridhar Mahadevan,et al.  Recent Advances in Hierarchical Reinforcement Learning , 2003, Discret. Event Dyn. Syst..

[563]  Ethem Alpaydin,et al.  Simplified ART: A new class of ART algorithms , 1998 .

[564]  W. Schultz Multiple dopamine functions at different time courses. , 2007, Annual review of neuroscience.

[565]  O. Hikosaka,et al.  Comparison of Reward Modulation in the Frontal Eye Field and Caudate of the Macaque , 2006, The Journal of Neuroscience.

[566]  Antonio Bicchi,et al.  An atlas of physical human-robot interaction , 2008 .

[567]  Masaki Ogino,et al.  Cognitive Developmental Robotics: A Survey , 2009, IEEE Transactions on Autonomous Mental Development.

[568]  P. Dayan,et al.  Serotonin in affective control. , 2009, Annual review of neuroscience.

[569]  A. Needham,et al.  A pick-me-up for infants’ exploratory skills: Early simulated experiences reaching for objects using ‘sticky mittens’ enhances young infants’ object exploration skills , 2002 .

[570]  C. Koch,et al.  A saliency-based search mechanism for overt and covert shifts of visual attention , 2000, Vision Research.

[571]  Jürgen Schmidhuber,et al.  New Millennium AI and the Convergence of History: Update of 2012 , 2012 .

[572]  Michael R. Zinn,et al.  A New Actuation Approach for Human Friendly Robot Design , 2004, Int. J. Robotics Res..

[573]  W. Welker,et al.  Some determinants of play and exploration in chimpanzees. , 1956, Journal of comparative and physiological psychology.

[574]  Alberto Maria Segre,et al.  Programs for Machine Learning , 1994 .

[575]  Oliver Speck,et al.  Midbrain fMRI: Applications, Limitations and Challenges , 2015 .

[576]  Kai A. Krueger,et al.  Flexible shaping: How learning in small steps helps , 2009, Cognition.

[577]  P. Redgrave,et al.  Is the short-latency dopamine response too short to signal reward error? , 1999, Trends in Neurosciences.

[578]  Stephen Grossberg,et al.  The ART of adaptive pattern recognition by a self-organizing neural network , 1988, Computer.

[579]  P. Dayan,et al.  Opponency Revisited: Competition and Cooperation Between Dopamine and Serotonin , 2010, Neuropsychopharmacology.

[580]  W. Schultz Getting Formal with Dopamine and Reward , 2002, Neuron.

[581]  Fulvio Mastrogiovanni,et al.  Skin spatial calibration using force/torque measurements , 2011, 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[582]  Ethan S. Bromberg-Martin,et al.  Distinct Tonic and Phasic Anticipatory Activity in Lateral Habenula and Dopamine Neurons , 2010, Neuron.

[583]  S. Yantis,et al.  Visual Attention: Bottom-Up Versus Top-Down , 2004, Current Biology.

[584]  Pierre Baldi,et al.  Bayesian surprise attracts human attention , 2005, Vision Research.

[585]  M. Guitart-Masip,et al.  NOvelty-related Motivation of Anticipation and exploration by Dopamine (NOMAD): Implications for healthy aging , 2010, Neuroscience & Biobehavioral Reviews.

[586]  H. Barlow Vision: A computational investigation into the human representation and processing of visual information: David Marr. San Francisco: W. H. Freeman, 1982. pp. xvi + 397 , 1983 .

[587]  Richard L. Lewis,et al.  Where Do Rewards Come From , 2009 .

[588]  T. Poggio,et al.  Ill-posed problems in early vision: from computational theory to analogue networks , 1985, Proceedings of the Royal Society of London. A. Mathematical and Physical Sciences.

[589]  N. Lemon,et al.  Dopamine D1/D5 Receptors Gate the Acquisition of Novel Information through Hippocampal Long-Term Potentiation and Long-Term Depression , 2006, The Journal of Neuroscience.

[590]  Sebastian Thrun,et al.  Finding Structure in Reinforcement Learning , 1994, NIPS.

[591]  C. Padoa-Schioppa,et al.  Neurons in the orbitofrontal cortex encode economic value , 2006, Nature.

[592]  O. Hikosaka,et al.  Representation of negative motivational value in the primate lateral habenula , 2009, Nature Neuroscience.

[593]  J. E. Albano,et al.  Visual-motor function of the primate superior colliculus. , 1980, Annual review of neuroscience.

[594]  Emrah Düzel,et al.  The Hippocampal-VTA Loop: The Role of Novelty and Motivation in Controlling the Entry of Information into Long-Term Memory , 2013, Intrinsically Motivated Learning in Natural and Artificial Systems.

[595]  Edward T. Bullmore,et al.  Modular and Hierarchically Modular Organization of Brain Networks , 2010, Front. Neurosci..

[596]  Stephen R. Marsland,et al.  A self-organising network that grows when required , 2002, Neural Networks.

[597]  Stephen R. Marsland,et al.  On-line novelty detection for autonomous mobile robots , 2005, Robotics Auton. Syst..

[598]  R. Andersen,et al.  Coding of intention in the posterior parietal cortex , 1997, Nature.

[599]  Jürgen Schmidhuber,et al.  Hierarchies of Generalized Kolmogorov Complexities and Nonenumerable Universal Measures Computable in the Limit , 2002, Int. J. Found. Comput. Sci..

[600]  R. Wise,et al.  Linking Context with Reward: A Functional Circuit from Hippocampal CA3 to Ventral Tegmental Area , 2011, Science.

[601]  Aude Billard,et al.  On Learning, Representing, and Generalizing a Task in a Humanoid Robot , 2007, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[602]  J. Wickens,et al.  Neural mechanisms of reward-related motor learning , 2003, Current Opinion in Neurobiology.

[603]  Andrew G. Barto,et al.  Intrinsic Motivation and Reinforcement Learning , 2013, Intrinsically Motivated Learning in Natural and Artificial Systems.

[604]  Pierre-Yves Oudeyer,et al.  Incremental local online Gaussian Mixture Regression for imitation learning of multiple tasks , 2010, 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[605]  Aude Billard,et al.  Iterative learning of grasp adaptation through human corrections , 2012, Robotics Auton. Syst..

[606]  P. Redgrave,et al.  What is reinforced by phasic dopamine signals? , 2008, Brain Research Reviews.

[607]  E. Skinner A guide to constructs of control. , 1996, Journal of personality and social psychology.

[608]  John H. Holland,et al.  Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence , 1992 .

[609]  Andrew G. Barto,et al.  Intrinsically Motivated Reinforcement Learning: A Promising Framework for Developmental Robot Learning , 2005 .

[610]  B. Hommel,et al.  Contiguity and contingency in action-effect learning , 2004, Psychological research.

[611]  D. Lewkowicz,et al.  A dynamic systems approach to the development of cognition and action. , 2007, Journal of cognitive neuroscience.

[612]  Jürgen Schmidhuber,et al.  An on-line algorithm for dynamic reinforcement learning and planning in reactive environments , 1990, 1990 IJCNN International Joint Conference on Neural Networks.

[613]  W. Schultz,et al.  Responses of monkey dopamine neurons to reward and conditioned stimuli during successive steps of learning a delayed response task , 1993, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[614]  Geoffrey E. Hinton,et al.  The "wake-sleep" algorithm for unsupervised neural networks. , 1995, Science.

[615]  P. Corr,et al.  A two-dimensional neuropsychology of defense: fear/anxiety and defensive distance , 2004, Neuroscience & Biobehavioral Reviews.

[616]  P. Dayan,et al.  Behavioral/systems/cognitive Action Dominates Valence in Anticipatory Representations in the Human Striatum and Dopaminergic Midbrain , 2010 .

[617]  Douglas B. Lenat,et al.  Theory Formation by Heuristic Search , 1983, Artificial Intelligence.

[618]  Geoffrey E. Hinton,et al.  Reducing the Dimensionality of Data with Neural Networks , 2006, Science.

[619]  J. MacQueen Some methods for classification and analysis of multivariate observations , 1967 .

[620]  S. Maier,et al.  Behavioral control, the medial prefrontal cortex, and resilience , 2006, Dialogues in clinical neuroscience.

[621]  Pierre-Yves Oudeyer,et al.  Acroban the humanoid: playful and compliant physical child-robot interaction , 2010, SIGGRAPH '10.

[622]  Jürgen Schmidhuber,et al.  Simple Algorithmic Principles of Discovery, Subjective Beauty, Selective Attention, Curiosity & Creativity , 2007, Discovery Science.

[623]  G. Rainer,et al.  Cognitive neuroscience: Neural mechanisms for detecting and remembering novel events , 2003, Nature Reviews Neuroscience.

[624]  J. Blake,et al.  Creating the Gene Ontology Resource : Design and Implementation The Gene Ontology Consortium 2 , 2001 .

[625]  Michael McCloskey,et al.  Catastrophic Interference in Connectionist Networks: The Sequential Learning Problem , 1989 .

[626]  G. Baldassarre,et al.  Functions and Mechanisms of Intrinsic Motivations The Knowledge Versus Competence Distinction , 2012 .

[627]  Christophe Giraud-Carrier,et al.  High Capacity Neural Networks for Familiarity Discrimination , 1999 .

[628]  T. Hökfelt,et al.  The origin of the dopamine nerve terminals in limbic and frontal cortex. Evidence for meso-cortico dopamine neurons. , 1974, Brain research.

[629]  L. Festinger A Theory of Cognitive Dissonance , 1957 .

[630]  K. Gurney,et al.  A Physiologically Plausible Model of Action Selection and Oscillatory Activity in the Basal Ganglia , 2006, The Journal of Neuroscience.

[631]  P. Redgrave,et al.  The basal ganglia: a vertebrate solution to the selection problem? , 1999, Neuroscience.

[632]  Francesco Lacquaniti,et al.  Control of Fast-Reaching Movements by Muscle Synergy Combinations , 2006, The Journal of Neuroscience.

[633]  Y. Niv,et al.  Learning latent structure: carving nature at its joints , 2010, Current Opinion in Neurobiology.

[634]  Satinder Singh Transfer of learning by composing solutions of elemental sequential tasks , 2004, Machine Learning.

[635]  K. Berridge,et al.  Fear and Feeding in the Nucleus Accumbens Shell: Rostrocaudal Segregation of GABA-Elicited Defensive Behavior Versus Eating Behavior , 2001, The Journal of Neuroscience.

[636]  Ales Leonardis,et al.  Incremental PCA for on-line visual learning and recognition , 2002, Object recognition supported by user interaction for service robots.

[637]  Lisa Meeden,et al.  Category-based intrinsic motivation , 2009, EpiRob.

[638]  Scott P. Johnson Development of Perceptual Completion in Infancy , 2004, Psychological science.

[639]  Maria Lindskog,et al.  Dopamine in the hippocampus is cleared by the norepinephrine transporter. , 2011, The international journal of neuropsychopharmacology.

[640]  Jürgen Schmidhuber,et al.  Sequential neural text compression , 1996, IEEE Trans. Neural Networks.

[641]  G. Flore,et al.  Evidence for co-release of noradrenaline and dopamine from noradrenergic neurons in the cerebral cortex , 2001, Molecular Psychiatry.

[642]  J. Eyre,et al.  Development and Plasticity of the Corticospinal System in Man , 2003, Neural plasticity.

[643]  G. Gessa,et al.  Co-release of noradrenaline and dopamine in the cerebral cortex elicited by single train and repeated train stimulation of the locus coeruleus , 2005, BMC Neuroscience.

[644]  Richard Sproat,et al.  Morphology and computation , 1992 .

[645]  Michael L. Littman,et al.  Multi-resolution Exploration in Continuous Spaces , 2008, NIPS.

[646]  Lorenz T. Biegler,et al.  On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming , 2006, Math. Program..

[647]  Jurgen Schmidhuber,et al.  Artificial curiosity with planning for autonomous perceptual and cognitive development , 2011, 2011 IEEE International Conference on Development and Learning (ICDL).

[648]  Active Learning and Intrinsically Motivated Exploration in Robots : Advances and Challenges , 2010 .

[649]  G. Bronson The postnatal growth of visual capacity. , 1974, Child development.

[650]  R. Greene,et al.  CNS Dopamine Transmission Mediated by Noradrenergic Innervation , 2012, The Journal of Neuroscience.

[651]  E. Vaadia,et al.  Coincident but Distinct Messages of Midbrain Dopamine and Striatal Tonically Active Neurons , 2004, Neuron.

[652]  J. Kagan Motives and development. , 1972, Journal of personality and social psychology.

[653]  G. B. Kish Learning when the onset of illumination is used as reinforcing stimulus. , 1955, Journal of comparative and physiological psychology.

[654]  John C Gore,et al.  The role of the parietal cortex in visual feature binding , 2002, Proceedings of the National Academy of Sciences of the United States of America.

[655]  Peter Dayan,et al.  Structure in the Space of Value Functions , 2002, Machine Learning.

[656]  Thomas R. Gruber,et al.  A Translation Approach to Portable Ontologies , 1993 .

[657]  Mattias P. Karlsson,et al.  Network Dynamics Underlying the Formation of Sparse, Informative Representations in the Hippocampus , 2008, The Journal of Neuroscience.

[658]  Matthew Schlesinger,et al.  Decomposing infants’ object representations: A dual-route processing account , 2006, Connect. Sci..

[659]  Peter Stone,et al.  Towards autonomous sensor and actuator model induction on a mobile robot , 2006, Connect. Sci..

[660]  Jonathan M Chambers,et al.  Object-based biasing for attentional control of gaze: a comparison of biologically plausible mechanisms , 2009, BMC Neuroscience.

[661]  Pierre Baldi,et al.  Of bits and wows: A Bayesian theory of surprise with applications to attention , 2010, Neural Networks.

[662]  W. McD. Grundzüge der physiologischen Psychologie , 1902, Nature.

[663]  Mary Lou and Gero John S. Maher,et al.  Agent Models of 3D Virtual Worlds , 2002 .

[664]  Jürgen Schmidhuber,et al.  A local learning algorithm for dynamic feedforward and recurrent networks , 1990, Forschungsberichte, TU Munich.

[665]  Kenji Doya,et al.  Metalearning and neuromodulation , 2002, Neural Networks.

[666]  J. Salamone,et al.  Motivational views of reinforcement: implications for understanding the behavioral functions of nucleus accumbens dopamine , 2002, Behavioural Brain Research.

[667]  R. Morris,et al.  Dopamine and Memory: Modulation of the Persistence of Memory for Novel Hippocampal NMDA Receptor-Dependent Paired Associates , 2010, The Journal of Neuroscience.

[668]  D. Berlyne NOVELTY AND CURIOSITY AS DETERMINANTS OF EXPLORATORY BEHAVIOUR1 , 1950 .

[669]  S. Mizumori,et al.  Ventral tegmental area and substantia nigra neural correlates of spatial learning. , 2011, Learning & memory.

[670]  Angelo Cangelosi,et al.  An open-source simulator for cognitive robotics research: the prototype of the iCub humanoid robot simulator , 2008, PerMIS.

[671]  Pierre-Yves Oudeyer,et al.  On the Impact of Robotics in Behavioral and Cognitive Sciences: From Insect Navigation to Human Cognitive Development , 2010, IEEE Transactions on Autonomous Mental Development.

[672]  Antonio Bicchi,et al.  Integration of active and passive compliance control for safe human-robot coexistence , 2009, 2009 IEEE International Conference on Robotics and Automation.

[673]  Giulio Sandini,et al.  An experimental evaluation of a novel minimum-jerk cartesian controller for humanoid robots , 2010, 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[674]  John S. Gero,et al.  Creativity, emergence and evolution in design , 1996, Knowl. Based Syst..

[675]  B. Balleine Neural bases of food-seeking: Affect, arousal and reward in corticostriatolimbic circuits , 2005, Physiology & Behavior.

[676]  Robert E Hampson,et al.  Differential but Complementary Mnemonic Functions of the Hippocampus and Subiculum , 2004, Neuron.

[677]  Pierre-Yves Oudeyer,et al.  What is Intrinsic Motivation? A Typology of Computational Approaches , 2007, Frontiers Neurorobotics.

[678]  Daniel H. Grollman,et al.  Incremental learning of subtasks from unsegmented demonstration , 2010, 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[679]  A. Tversky,et al.  The framing of decisions and the psychology of choice. , 1981, Science.

[680]  Okihide Hikosaka,et al.  Reward-Dependent Gain and Bias of Visual Responses in Primate Superior Colliculus , 2003, Neuron.

[681]  M. Csíkszentmihályi Creativity: Flow and the Psychology of Discovery and Invention , 1996 .

[682]  Anthony Dickinson,et al.  The 28th Bartlett Memorial Lecture Causal Learning: An Associative Analysis , 2001, The Quarterly journal of experimental psychology. B, Comparative and physiological psychology.

[683]  Richard S. Sutton,et al.  Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.

[684]  J. Horvitz,et al.  Burst activity of ventral tegmental dopamine neurons is elicited by sensory stimuli in the awake cat , 1997, Brain Research.

[685]  E. Tolman Cognitive maps in rats and men. , 1948, Psychological review.

[686]  P. W. Jones,et al.  Bandit Problems, Sequential Allocation of Experiments , 1987 .

[687]  N. Franceschini,et al.  From insect vision to robot vision , 1992 .

[688]  M. Goldberg,et al.  The representation of visual salience in monkey parietal cortex , 1998, Nature.

[689]  D. Berlyne Conflict, arousal, and curiosity , 2014 .

[690]  José Santos-Victor,et al.  Sound Localization for Humanoid Robots - Building Audio-Motor Maps based on the HRTF , 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[691]  H. Harlow,et al.  Learning motivated by a manipulation drive. , 1950, Journal of experimental psychology.

[692]  Yuval Shahar,et al.  Dynamic temporal interpretation contexts for temporal abstraction , 1998, Annals of Mathematics and Artificial Intelligence.

[693]  M. Geffard,et al.  Antibodies to Dopamine: Radioimmunological Study of Specificity in Relation to Immunocytochemistry , 1984, Journal of neurochemistry.

[694]  M.M. Deris,et al.  A Comparative Study for Outlier Detection Techniques in Data Mining , 2006, 2006 IEEE Conference on Cybernetics and Intelligent Systems.

[695]  Timothy E. J. Behrens,et al.  Learning the value of information in an uncertain world , 2007, Nature Neuroscience.

[696]  Thomas G. Dietterich Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res..

[697]  Roderic A. Grupen,et al.  A control basis for multilegged walking , 1996, Proceedings of IEEE International Conference on Robotics and Automation.

[698]  J. Michael Herrmann,et al.  Learning predictive representations , 2000, Neurocomputing.

[699]  Doina Precup,et al.  Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..

[700]  VARUN CHANDOLA,et al.  Outlier Detection : A Survey , 2007 .

[701]  J. Schmidhuber Science as By-Products of Search for Novel Patterns , or Data Compressible in Unknown Yet Learnable Ways , 2009 .

[702]  Lihong Li,et al.  Learning from Logged Implicit Exploration Data , 2010, NIPS.

[703]  Daniel Paredes,et al.  Neuregulin-1 regulates LTP at CA1 hippocampal synapses through activation of dopamine D4 receptors , 2008, Proceedings of the National Academy of Sciences.

[704]  W. Pan,et al.  Dopamine Cells Respond to Predicted Events during Classical Conditioning: Evidence for Eligibility Traces in the Reward-Learning Network , 2005, The Journal of Neuroscience.

[705]  J. Wagner,et al.  Dopamine transporter blockade increases LTP in the CA1 region of the rat hippocampus via activation of the D3 dopamine receptor. , 2006, Learning & memory.

[706]  O. Hikosaka,et al.  Lateral habenula as a source of negative reward signals in dopamine neurons , 2007, Nature.

[707]  Stephen Hart,et al.  The development of hierarchical knowledge in robot systems , 2009 .

[708]  W. C. Hall,et al.  Collateral projections of predorsal bundle cells of the superior colliculus in the rat , 1989, The Journal of comparative neurology.

[709]  Arthur L. Samuel,et al.  Some Studies in Machine Learning Using the Game of Checkers , 1967, IBM J. Res. Dev.

[710]  E. Kandel,et al.  D1/D5 receptor agonists induce a protein synthesis-dependent late potentiation in the CA1 region of the hippocampus. , 1995, Proceedings of the National Academy of Sciences of the United States of America.

[711]  P. Dayan,et al.  Reward, Motivation, and Reinforcement Learning , 2002, Neuron.

[712]  Kathryn E. Merrick,et al.  Motivated Learning from Interesting Events: Adaptive, Multitask Learning Agents for Complex Environments , 2009, Adapt. Behav.

[713]  Douglas S. Blank,et al.  An Emergent Framework For Self-Motivation In Developmental Robotics , 2004 .

[714]  R. Morris,et al.  Relevance of synaptic tagging and capture to the persistence of long-term potentiation and everyday spatial memory , 2010, Proceedings of the National Academy of Sciences.

[715]  H. Sebastian Seung,et al.  Query by committee , 1992, COLT '92.

[716]  R. Ivry,et al.  Cerebellar involvement in anticipating the consequences of self-produced actions during bimanual movements. , 2005, Journal of neurophysiology.

[717]  John N. J. Reynolds,et al.  Dopamine-dependent plasticity of corticostriatal synapses , 2002, Neural Networks.

[718]  Aude Billard,et al.  Reaching with multi-referential dynamical systems , 2008, Auton. Robots.

[719]  Martha Flanders,et al.  Muscular and postural synergies of the human hand. , 2004, Journal of neurophysiology.

[720]  Peter Dayan,et al.  Bilinearity, Rules, and Prefrontal Cortex , 2007, Frontiers Comput. Neurosci.

[721]  Christopher W. Geib,et al.  Title of the Deliverable: Publication about Multi-level Learning System Attachment 1 Attachment 2 a Formal Definition of Object-action Complexes and Examples at Different Levels of the Processing Hierarchy , 2022 .

[722]  Christof Koch,et al.  A Model of Saliency-Based Visual Attention for Rapid Scene Analysis , 2009 .

[723]  G. B. Kish,et al.  Unconditioned operant behavior in two homozygous strains of mice. , 1956, The Journal of Genetic Psychology.

[724]  M. Fyhn,et al.  Hippocampal Neurons Responding to First-Time Dislocation of a Target Object , 2002, Neuron.

[725]  B. Balleine,et al.  Goal-directed instrumental action: contingency and incentive learning and their cortical substrates , 1998, Neuropharmacology.

[726]  Mark Ring,et al.  Compression Progress-Based Curiosity Drive for Developmental Learning , 2011 .

[727]  Giulio Sandini,et al.  The iCub Platform: A Tool for Studying Intrinsically Motivated Learning , 2013, Intrinsically Motivated Learning in Natural and Artificial Systems.

[728]  D. Shohamy,et al.  Integrating Memories in the Human Brain: Hippocampal-Midbrain Encoding of Overlapping Events , 2008, Neuron.

[729]  R. F. Thompson,et al.  Habituation: a model phenomenon for the study of neuronal substrates of behavior. , 1966, Psychological review.

[730]  K. Chaloner,et al.  Bayesian Experimental Design: A Review , 1995 .

[731]  Peter Auer,et al.  Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.

[732]  Peter Redgrave,et al.  A computational model of action selection in the basal ganglia. I. A new functional anatomy , 2001, Biological Cybernetics.

[733]  A. Tversky,et al.  Judgment under Uncertainty: Heuristics and Biases , 1974, Science.

[734]  D. Parisi,et al.  TRoPICALS: a computational embodied neuroscience model of compatibility effects. , 2010, Psychological review.

[735]  D. Berlyne.  A theory of human curiosity. , 1954, British journal of psychology.

[736]  Dana Angluin,et al.  Queries and concept learning , 1988, Machine Learning.