Human action categorization using Conditional Random Field

Automatic human action recognition remains a challenging problem in machine vision. Some high-level features such as SIFT perform well for action recognition but are computationally expensive. To address this, we construct features from the Distance Transform of body contours, which is relatively simple and computationally efficient, to represent human action in video. After extracting these features, we use a Conditional Random Field to model the temporal action sequences. The proposed method is tested on an available standard dataset. We also verify the robustness of our method under various realistic conditions, such as body occlusion or intersection.
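To make the feature construction concrete, the following is a minimal sketch of a per-frame descriptor built from the distance transform of a body contour. It is an illustration under stated assumptions, not the implementation used in this work: the binary silhouette input, the 4-neighbourhood contour rule, the brute-force Euclidean search, the scaling to [0, 1], and the names `contour_pixels`, `distance_transform`, and `frame_feature` are all hypothetical choices made for the example.

```python
import math


def contour_pixels(mask):
    """Return foreground pixels that touch the background (4-neighbourhood)."""
    h, w = len(mask), len(mask[0])
    contour = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                # Foreground pixels on the image border also count as contour.
                if not (0 <= ny < h and 0 <= nx < w) or not mask[ny][nx]:
                    contour.append((y, x))
                    break
    return contour


def distance_transform(mask):
    """Euclidean distance from every pixel to its nearest contour pixel.

    Brute force for clarity; a linear-time algorithm would be preferable
    for full-resolution frames.
    """
    h, w = len(mask), len(mask[0])
    contour = contour_pixels(mask)
    if not contour:
        raise ValueError("mask contains no foreground pixels")
    return [
        [min(math.hypot(y - cy, x - cx) for cy, cx in contour) for x in range(w)]
        for y in range(h)
    ]


def frame_feature(mask):
    """Flatten the distance map into one vector scaled to [0, 1] (assumed scaling)."""
    dist = distance_transform(mask)
    flat = [d for row in dist for d in row]
    peak = max(flat)
    return [d / peak for d in flat] if peak > 0 else flat


# Toy 5x5 silhouette: a 3x3 foreground block standing in for a body mask.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
dist = distance_transform(mask)
feature = frame_feature(mask)
```

Computing one such vector per frame yields an observation sequence for the video, which is the kind of input a sequence model such as a Conditional Random Field would then label.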
