Off-line simulation inspires insight: A neurodynamics approach to efficient robot task learning

Demand is growing for robots that can acquire the sequential organization of tasks through social learning interactions with ordinary people. Interactive learning by demonstration and communication is a promising direction in current robotics research. However, the efficient acquisition of generalized task representations that allow the robot to adapt to different users and contexts remains a major challenge. In this paper, we present a dynamic neural field (DNF) model inspired by the hypothesis that the nervous system uses the off-line re-activation of initial memory traces to incrementally incorporate new information into structured knowledge. To achieve this, the model combines two learning processes: fast activation-based learning, which robustly represents sequential information from single task demonstrations, and slower weight-based learning during internal simulations, which establishes longer-term associations between neural populations representing individual subtasks. The efficiency of the learning process is tested in an assembly paradigm in which the humanoid robot ARoS learns to construct a toy vehicle from its parts. User demonstrations with different serial orders, together with the correction of initial prediction errors, allow the robot to acquire generalized task knowledge about possible serial orders and the longer-term dependencies between subgoals in very few social learning interactions. This success is shown in a joint action scenario in which ARoS uses the newly acquired assembly plan to construct the toy together with a human partner.
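
As a rough illustration of the fast, activation-based memory mentioned above, the following Python sketch simulates a generic one-dimensional neural field of the Amari type, in which a brief localized input leaves behind a self-sustained bump of activity. It is not the model from this paper: the kernel shape (local excitation with constant global inhibition), the step-function firing rate, all parameter values, and the name simulate_field are assumptions chosen for illustration only.

```python
import numpy as np


def simulate_field(apply_input=True, n=101, dt=0.1, tau=1.0,
                   t_input=5.0, t_total=30.0):
    """Euler simulation of a 1D neural field (illustrative parameters).

    Dynamics (grid spacing 1):
        tau * du/dt = -u + h + S(x, t) + sum_j w(|x - x_j|) * f(u_j)
    with f the Heaviside step function.
    """
    x = np.arange(n, dtype=float)
    h = -1.0  # negative resting level (assumed value)

    # Interaction kernel: Gaussian excitation minus constant inhibition.
    dist = np.abs(x[:, None] - x[None, :])
    w = 2.0 * np.exp(-dist**2 / (2.0 * 3.0**2)) - 0.5

    # Transient localized input centred at x = 50 (the "demonstration").
    stimulus = 3.0 * np.exp(-(x - 50.0)**2 / (2.0 * 2.0**2))

    u = np.full(n, h)
    steps = int(round(t_total / dt))
    for step in range(steps):
        t = step * dt
        s = stimulus if (apply_input and t < t_input) else 0.0
        f = (u > 0.0).astype(float)       # step-function firing rate
        du = -u + h + s + w @ f           # recurrent input via the kernel
        u = u + (dt / tau) * du
    return x, u


if __name__ == "__main__":
    x, u = simulate_field(apply_input=True)
    active = x[u > 0.0]
    print("active sites after input removal:", active)
```

In this sketch the input is switched off at t_input, yet the field remains above threshold around x = 50 because the recurrent excitation sustains it. Calling simulate_field(apply_input=False) leaves the field at its resting level, so the persistent bump encodes that the event occurred. The slower weight-based learning described in the abstract is not covered here.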
