Goal-oriented behavior sequence generation based on semantic commands using multiple timescales recurrent neural network with initial state correction

In this paper, we propose a novel scheme for goal-oriented behavior sequence generation in tasks involving multiple objects, with the aim of building an autonomous robot. The scheme comprises three major functions: (1) visual attention for target object localization; (2) automatic initial state correction based on experience, using simple reinforcement learning; and (3) behavior sequence generation based on a multiple timescales recurrent neural network (MTRNN). The proposed scheme systematically combines these three functions so that an autonomous biped robot can automatically execute tasks involving multiple objects based on high-level semantic commands given by a human supervisor. The selective attention model continuously captures the visual environment to determine the current state of the robot and to perceive the relationship between that state and the environment (depth perception and localization of a target object). If the current state differs from the initial state, as judged by depth perception and localization of the target object, the robot automatically adjusts its current state to the initial state by integrating visual attention and simple reinforcement learning. After the initial state has been corrected, the behavior sequence generation function generates suitable behavior timing signals, by integrating visual attention and the MTRNN, based on the high-level semantic commands given by the human supervisor. Experimental results show that the proposed scheme can successfully generate suitable behavior timing for a robot to autonomously achieve tasks involving multiple objects, such as searching for, approaching, and hitting the target object with its arm.
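The abstract does not give the network equations, so the following is only a minimal sketch of the leaky-integrator update commonly used in MTRNN models, in which each unit has its own time constant: small values give fast units that track motion detail, and large values give slow units that carry sequence-level context. The function names, unit counts, time constants, and random weights are illustrative assumptions, not values from the paper, and no training is shown.

```python
import math
import random


def mtrnn_step(u, weights, taus):
    """One leaky-integrator update for every unit (illustrative sketch).

    u       : list of internal states u_i(t)
    weights : weights[i][j] is the connection from unit j to unit i
    taus    : per-unit time constants, each >= 1
    """
    # Firing rates computed from the current internal states.
    y = [math.tanh(v) for v in u]
    new_u = []
    for i, tau in enumerate(taus):
        drive = sum(w * yj for w, yj in zip(weights[i], y))
        # A large tau keeps most of the old state, so the unit changes slowly.
        new_u.append((1.0 - 1.0 / tau) * u[i] + drive / tau)
    return new_u


def demo(steps=50, n_fast=4, n_slow=2, seed=0):
    """Run a small untrained network; all sizes and constants are assumed."""
    rng = random.Random(seed)
    n = n_fast + n_slow
    taus = [2.0] * n_fast + [50.0] * n_slow  # assumed fast/slow time constants
    weights = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
    u = [rng.uniform(-1, 1) for _ in range(n)]
    history = [u]
    for _ in range(steps):
        u = mtrnn_step(u, weights, taus)
        history.append(u)
    return taus, history


if __name__ == "__main__":
    taus, history = demo()
    print("final internal states:", [round(v, 3) for v in history[-1]])
```

In a scheme like the one described, the state from which such a network starts determines which behavior sequence unfolds, which is one reason correcting the robot's initial state before generation matters.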
