Continuous Motion Recognition Using Multiple Time Constant Recurrent Neural Network with a Deep Network Model

The multiple timescale recurrent neural network (MTRNN) model is a useful tool for modeling continuous signals in dynamic tasks such as human action recognition. Because the initial states of the MTRNN can be set differently, a single network model can conveniently predict multiple signals. However, switching to a suitable initial state in the slow context units of the MTRNN then becomes a critical condition for carrying out the desired dynamic tasks. In this paper, we propose a hybrid neural network model that combines the MTRNN with a deep learning neural network (DN) to overcome the problem of setting the initial state in the MTRNN. The DN, working together with the MTRNN, generates a suitable initial state for the slow context units of the MTRNN whenever a change of situation is automatically detected. We apply our approach to 20 motion skeleton units obtained by Kinect to construct three kinds of human motion sequences. The results show that the proposed method is able to recognize various motions in real time by using proper initial state information.
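The role of the slow context initial state can be illustrated with a minimal sketch. The abstract does not give the network equations, so the code below assumes the leaky-integrator update commonly used for multiple timescale networks, u_i(t+1) = (1 - 1/tau_i) u_i(t) + (1/tau_i) sum_j w_ij tanh(u_j(t)). The unit counts, time constants, random untrained weights, and the names `mtrnn_step` and `rollout` are all hypothetical and chosen only for illustration; the DN that would supply the initial state is not modeled.

```python
import math
import random


def mtrnn_step(u, weights, tau):
    """One assumed leaky-integrator update over all units.

    u_i <- (1 - 1/tau_i) * u_i + (1/tau_i) * sum_j w_ij * tanh(u_j)
    """
    y = [math.tanh(v) for v in u]
    new_u = []
    for i in range(len(u)):
        drive = sum(weights[i][j] * y[j] for j in range(len(u)))
        new_u.append((1.0 - 1.0 / tau[i]) * u[i] + drive / tau[i])
    return new_u


def rollout(slow_init, weights, tau, n_fast, steps):
    """Run the network from a given slow-context initial state.

    Fast units start at zero; only the slow initial state differs
    between runs. Returns the fast-unit activations at each step.
    """
    u = [0.0] * n_fast + list(slow_init)
    trajectory = []
    for _ in range(steps):
        u = mtrnn_step(u, weights, tau)
        trajectory.append([math.tanh(v) for v in u[:n_fast]])
    return trajectory


# Hypothetical sizes and time constants (not taken from the paper).
N_FAST, N_SLOW = 4, 2
n_units = N_FAST + N_SLOW
rng = random.Random(0)
# Random, untrained weights stand in for a trained network.
weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_units)]
           for _ in range(n_units)]
# Small tau: fast units; large tau: slow context units.
tau = [2.0] * N_FAST + [50.0] * N_SLOW

# Same weights, two different slow-context initial states.
traj_a = rollout([1.0, -1.0], weights, tau, N_FAST, steps=20)
traj_b = rollout([-1.0, 1.0], weights, tau, N_FAST, steps=20)
```

With identical weights, `traj_a` and `traj_b` differ only because the slow context units start from different states. This is the property that lets one network reproduce several signals, and it is why choosing the right initial state matters for recognition.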
