3D Mean-Shift Tracking of Human Body Parts and Recognition of Working Actions in an Industrial Environment

In this study we describe a method for the 3D trajectory-based recognition of and discrimination between different working actions in an industrial environment. The scene is represented by a motion-attributed 3D point cloud computed from the images of a small-baseline trinocular camera system. A two-stage mean-shift algorithm is used for the detection and 3D tracking of all moving objects in the scene. A sequence of working actions is recognised by particle filter-based matching of a non-stationary Hidden Markov Model, relying on spatial context and a classification of the observed 3D trajectories. The system is able to single out an object performing a known action from a multitude of tracked objects. The 3D tracking stage is evaluated with respect to its metric accuracy on nine real-world test image sequences for which ground truth data were determined. The action recognition stage is evaluated experimentally on 20 real-world test sequences acquired from different viewpoints in an industrial working environment. We show that our system performs 3D tracking of human body parts and subsequent recognition of working actions under difficult, realistic conditions. When the sequence of working actions is interrupted, the system enters a safety mode and returns to the regular mode as soon as the working actions continue.
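To illustrate the kind of mode-seeking step that underlies mean-shift tracking on a 3D point cloud, the following minimal sketch shows a single mean-shift iteration loop in Python. It is not the two-stage detection-and-tracking pipeline of the paper; the function name, the Gaussian kernel choice, and the bandwidth value are illustrative assumptions.

```python
import numpy as np

def mean_shift_3d(points, start, bandwidth=0.1, max_iter=50, tol=1e-4):
    """Shift a 3D mode estimate towards the local density maximum of a point cloud.

    points:    (N, 3) array of scene points (e.g. motion-attributed 3D points)
    start:     initial 3D position of the tracked object or body part
    bandwidth: kernel scale in metres (illustrative value)
    """
    mode = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        # Gaussian kernel weights of all points relative to the current mode
        d2 = np.sum((points - mode) ** 2, axis=1)
        w = np.exp(-d2 / (2.0 * bandwidth ** 2))
        if w.sum() < 1e-12:
            break  # no support near the current estimate; stop iterating
        # Weighted mean of the points is the new mode estimate (mean-shift step)
        new_mode = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(new_mode - mode) < tol:
            mode = new_mode
            break
        mode = new_mode
    return mode
```

In a tracking setting, such an update would typically be re-initialised in each frame from the previous frame's estimate, so that the mode follows the moving cluster of motion-attributed 3D points.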
