Hierarchical curiosity loops and active sensing

A curious agent acts so as to optimize its learning about itself and its environment, without external supervision. We present a model of hierarchical curiosity loops for such an autonomous active learning agent, in which each loop selects the action that maximizes the agent's learning of sensory-motor correlations. The model rewards the learner's prediction errors within an actor-critic reinforcement learning (RL) paradigm. Hierarchy is achieved by using previously learned motor-sensory mappings to enable the learning of further mappings, thus increasing the extent and diversity of the agent's knowledge and skills. We demonstrate the relevance of this architecture to active sensing using the well-studied vibrissae (whisker) system, in which rodents acquire sensory information through repeated whisker movements. We show that hierarchical curiosity loops, starting from optimally learning the internal models of whisker motion and then extending to object localization, result in free-air whisking and object palpation, respectively.
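To make the idea of rewarding prediction errors in an actor-critic scheme concrete, here is a minimal toy sketch. It is not the model from the paper: it assumes a single state, three discrete actions with fixed sensory outcomes (the hypothetical `TRUE_OUTCOME` list), a tabular learner, a softmax actor, and a scalar critic. The function name, parameters, and learning rates are all illustrative assumptions.

```python
import math
import random

# Hypothetical environment: each action produces a fixed sensory outcome.
TRUE_OUTCOME = [0.0, 1.0, -2.0]


def softmax(prefs):
    """Convert action preferences into selection probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def run_curiosity_loop(steps=5000, seed=0, lr_model=0.2,
                       lr_actor=0.02, lr_critic=0.1):
    """Single-state curiosity loop (illustrative assumption, not the paper's model).

    Returns the learner's final predictions and the intrinsic reward per step.
    """
    rng = random.Random(seed)
    n = len(TRUE_OUTCOME)
    prediction = [0.0] * n  # learner: predicted sensation for each action
    prefs = [0.0] * n       # actor: preference for each action
    baseline = 0.0          # critic: running estimate of intrinsic reward
    rewards = []

    for _ in range(steps):
        probs = softmax(prefs)
        action = rng.choices(range(n), weights=probs)[0]
        sensation = TRUE_OUTCOME[action]

        # Intrinsic reward is the learner's squared prediction error.
        error = sensation - prediction[action]
        reward = error * error

        # Critic: the reward minus the baseline acts as the teaching signal.
        td_error = reward - baseline
        baseline += lr_critic * td_error

        # Actor: policy-gradient step on the softmax preferences.
        for i in range(n):
            indicator = 1.0 if i == action else 0.0
            prefs[i] += lr_actor * td_error * (indicator - probs[i])

        # Learner: reduce the prediction error for the chosen action.
        prediction[action] += lr_model * error
        rewards.append(reward)

    return prediction, rewards


if __name__ == "__main__":
    prediction, rewards = run_curiosity_loop()
    print("learned predictions:", [round(p, 3) for p in prediction])
    print("mean reward, first 100 steps:", sum(rewards[:100]) / 100)
    print("mean reward, last 100 steps:", sum(rewards[-100:]) / 100)
```

In this sketch the actor is drawn toward actions whose outcomes are still poorly predicted, and the intrinsic reward fades as the learner's predictions improve. A hierarchical version would presumably start a new loop that uses the learned mapping as a building block, but that step is not shown here.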
