3D Action Recognition in an Industrial Environment

In this study we introduce a method for 3D trajectory-based recognition of and discrimination between different working actions. The 3D pose of the human hand-forearm limb is tracked over time with a two-hypothesis tracking framework based on the Shape Flow algorithm. A sequence of working actions is recognised with a particle-filter-based non-stationary Hidden Markov Model framework that relies on spatial context and on a classification of the observed 3D trajectories, using the Levenshtein Distance on Trajectories as a measure of similarity between the observed trajectories and a set of reference trajectories. An experimental evaluation is performed on 20 real-world test sequences acquired from different viewpoints in an industrial working environment. The action-specific recognition rates of our system exceed 90%, and the actions are recognised with a delay of typically a few tenths of a second. Our system detects disturbances, i.e. interruptions of the sequence of working actions, and reacts by entering a safety mode; it returns to the regular mode as soon as the working actions continue.
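The abstract does not spell out the exact formulation of the trajectory similarity measure, so the following is only a minimal sketch of an edit distance on 3D trajectories in the spirit of the Levenshtein Distance on Trajectories. The matching threshold `eps`, the Euclidean point distance, and the normalisation by the longer trajectory are assumptions for illustration, not the paper's definitive implementation.

```python
import numpy as np

def trajectory_edit_distance(traj_a, traj_b, eps=0.05):
    """Levenshtein-style edit distance between two 3D trajectories.

    traj_a, traj_b: arrays of shape (N, 3) and (M, 3) holding 3D points.
    Two points are treated as a 'match' (substitution cost 0) when their
    Euclidean distance is below eps; otherwise insertions, deletions and
    substitutions each cost 1, as in the classic Levenshtein recursion.
    """
    n, m = len(traj_a), len(traj_b)
    d = np.zeros((n + 1, m + 1))
    d[:, 0] = np.arange(n + 1)
    d[0, :] = np.arange(m + 1)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if np.linalg.norm(traj_a[i - 1] - traj_b[j - 1]) < eps else 1
            d[i, j] = min(d[i - 1, j] + 1,          # deletion
                          d[i, j - 1] + 1,          # insertion
                          d[i - 1, j - 1] + cost)   # match / substitution
    # Normalise by the longer trajectory so sequences of different
    # lengths remain comparable.
    return d[n, m] / max(n, m)

def classify_trajectory(observed, reference_trajectories):
    """Assign the observed trajectory to the nearest reference action."""
    distances = {label: trajectory_edit_distance(observed, ref)
                 for label, ref in reference_trajectories.items()}
    return min(distances, key=distances.get)
```

In such a scheme, the resulting per-action distances could serve as (inverse) likelihood weights for the particle-filter-based Hidden Markov Model stage; how the paper actually couples the two components is not stated in the abstract.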
