Robust Workflow Recognition Using Holistic Features and Outlier-Tolerant Fused Hidden Markov Models

Monitoring real-world environments such as industrial scenes is challenging because of heavy occlusions, visually similar processes, and frequent illumination changes. We propose a robust framework for recognizing workflows in such complex environments, with three contributions. First, we employ a novel holistic scene descriptor to model complex scenes efficiently and robustly, thus bypassing the very challenging tasks of target recognition and tracking. Second, we handle limited visibility and occlusions by exploiting redundancy, fusing information from multiple cameras. Finally, we use the multivariate Student-t distribution as the observation likelihood of the employed Hidden Markov Models to further enhance robustness to outliers. We evaluate the examined approaches in real-life visual behavior understanding scenarios, and we compare and discuss the obtained results.
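
To illustrate why a Student-t observation likelihood tolerates outliers better than a Gaussian one, the following minimal sketch compares the two log-densities on an inlier and an outlier. It is not the implementation used in this work: the diagonal scale matrix, the function names `gaussian_logpdf` and `student_t_logpdf`, and the example values (including `nu = 3`) are assumptions chosen only for illustration.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log-density of a Gaussian with diagonal covariance (assumed for simplicity)."""
    d = len(x)
    # Squared Mahalanobis distance under a diagonal covariance.
    delta = sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))
    log_det = sum(math.log(vi) for vi in var)
    return -0.5 * (d * math.log(2.0 * math.pi) + log_det + delta)


def student_t_logpdf(x, mean, scale, nu):
    """Log-density of a multivariate Student-t with diagonal scale matrix.

    `scale` holds the diagonal entries of the scale matrix and `nu` is the
    degrees of freedom; small `nu` gives heavier tails.
    """
    d = len(x)
    delta = sum((xi - mi) ** 2 / si for xi, mi, si in zip(x, mean, scale))
    log_det = sum(math.log(si) for si in scale)
    return (
        math.lgamma((nu + d) / 2.0)
        - math.lgamma(nu / 2.0)
        - 0.5 * d * math.log(nu * math.pi)
        - 0.5 * log_det
        - 0.5 * (nu + d) * math.log1p(delta / nu)
    )


if __name__ == "__main__":
    mean, scale = [0.0, 0.0], [1.0, 1.0]
    # Hypothetical feature vectors: one near the mean, one far from it.
    for label, x in [("inlier", [0.5, -0.5]), ("outlier", [10.0, 10.0])]:
        g = gaussian_logpdf(x, mean, scale)
        t = student_t_logpdf(x, mean, scale, nu=3.0)
        print(f"{label}: gaussian={g:.3f} student_t={t:.3f}")
```

The Gaussian log-density falls quadratically with the distance from the mean, whereas the Student-t log-density falls only logarithmically, so a single corrupted observation (for example, from a briefly occluded camera) penalizes a state hypothesis far less under the Student-t model. As `nu` grows, the Student-t density approaches the Gaussian.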
