Generation of sensory reflex behavior versus intentional proactive behavior in robot learning of cooperative interactions with others

This paper investigates the essential difference between two behavior generation schemes, sensory reflex behavior generation and intentional proactive behavior generation, using a proposed dynamic neural network model, the stochastic multiple-timescale recurrent neural network (S-MTRNN). The model was employed in an experiment in which robots learned to cooperate with others whose behavior was potentially unpredictable. The learning results showed that sensory reflex behavior was generated by a self-organizing probabilistic prediction mechanism when the initial sensitivity characteristics of the network dynamics were not utilized in the learning process. In contrast, proactive behavior with a deterministic prediction mechanism developed when the initial sensitivity was utilized. It was further shown that when unexpected behaviors of others were observed, the behavioral context was re-situated differently in the two cases. In the former case, it was re-situated by adaptation of the internal neural dynamics through simple sensory reflexes. In the latter case, it was re-situated by error regression of the internal neural activity rather than by sensory reflex. The roles of top-down and bottom-up interactions in dealing with unexpected situations are discussed.
