Learning the Condition of Satisfaction of an Elementary Behavior in Dynamic Field Theory

Abstract In order to proceed along an action sequence, an autonomous agent has to recognize that the intended final condition of the previous action has been achieved. In previous work, we have shown how a sequence of actions can be generated by an embodied agent using a neural-dynamic architecture for behavioral organization, in which each action has an intention and a condition of satisfaction. These components are represented by dynamic neural fields and are coupled to the motors and sensors of the robotic agent. Here, we demonstrate how the mappings between intended actions and their resulting conditions may be learned, rather than pre-wired. We use reward-gated associative learning, in which, over many instances of externally validated goal achievement, the sensory conditions associated with goal achievement are learned. After learning, the external reward is no longer needed to recognize that the expected outcome has been achieved. This method was implemented using dynamic neural fields and tested on a real-world E-Puck mobile robot and a simulated NAO humanoid robot.
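The reward-gated associative scheme described above can be sketched in a few lines. The following is a minimal, hypothetical illustration, not the paper's implementation: fields are reduced to discrete activation vectors, a single weight matrix maps an intention peak to its expected perceptual outcome, and the Hebbian update is gated by an external reward signal. The field sizes, learning rate, and the simulated environment (`act`) are all illustrative assumptions.

```python
import numpy as np

# Discretized stand-ins for an intention field and a perceptual field.
n_intention, n_percept = 4, 6
W = np.zeros((n_intention, n_percept))   # intention -> expected percept mapping
eta = 0.5                                # learning rate (assumed)

# Simulated environment, used only for this sketch: intention i reliably
# produces percept (i + 1) when the action succeeds.
def act(intention_idx):
    return (intention_idx + 1) % n_percept

# Learning phase: an external reward gates the Hebbian (associative) update,
# so only externally validated successes strengthen the mapping.
for trial in range(40):
    i = trial % n_intention
    u_int = np.zeros(n_intention); u_int[i] = 1.0       # active intention peak
    u_per = np.zeros(n_percept);  u_per[act(i)] = 1.0   # resulting percept peak
    reward = 1.0                                         # externally signaled success
    W += eta * reward * np.outer(u_int, u_per)           # reward-gated Hebbian rule

# Recognition phase: no external reward is needed; the learned mapping alone
# predicts which percept should signal that the goal has been achieved.
def cos_detected(intention_idx, percept_idx, threshold=0.5):
    expected = W[intention_idx] / (W[intention_idx].max() + 1e-9)
    return bool(expected[percept_idx] > threshold)

print(cos_detected(2, act(2)))   # expected outcome -> True
print(cos_detected(2, 0))        # unrelated percept -> False
```

After training, the condition-of-satisfaction check is purely internal: the agent compares the current percept against the outcome its learned weights predict for the active intention, which is the key point the abstract makes about dropping the external reward.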
