Compressed Domain Real-time Action Recognition

We present a compressed-domain scheme that recognizes and localizes actions in real time. The recognition problem is posed as performing a video query on a test video sequence. Our method computes motion similarity from compressed-domain features, which can be extracted with low complexity. We introduce a novel motion correlation measure that takes into account differences in motion magnitudes. Our method is appearance invariant, requires no prior segmentation, alignment, or stabilization, and localizes actions in both space and time. We evaluated our method on a large action video database consisting of 6 actions performed by 25 people under 3 different scenarios. Our classification results compare favorably with existing methods at only a fraction of their computational cost.
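
The abstract does not define the motion correlation measure, so the following is only an illustrative sketch of what a magnitude-aware correlation between two block motion-vector fields could look like. It is an assumption, not the measure proposed in the paper: the function name `magnitude_aware_similarity`, the `(H, W, 2)` array layout, and the normalization by the larger of the two magnitudes are all hypothetical choices. The idea shown is that dividing the dot product by the squared larger magnitude rewards agreement in direction while penalizing a mismatch in magnitude.

```python
import numpy as np

def magnitude_aware_similarity(query_mv, test_mv, eps=1e-8):
    """Hypothetical magnitude-aware correlation between two motion fields.

    query_mv, test_mv: arrays of shape (H, W, 2), one motion vector per block
    (for example, vectors read from the compressed bitstream).
    Returns a score in [-1, 1]; this is NOT the paper's measure.
    """
    q = np.asarray(query_mv, dtype=float)
    t = np.asarray(test_mv, dtype=float)

    # Per-block dot product and vector magnitudes.
    dot = np.sum(q * t, axis=-1)
    q_norm = np.linalg.norm(q, axis=-1)
    t_norm = np.linalg.norm(t, axis=-1)

    # Normalize by the larger magnitude squared, so that equal vectors score 1
    # and same-direction vectors of different lengths score min/max < 1.
    larger = np.maximum(q_norm, t_norm)

    # Ignore blocks where neither field has motion.
    active = larger > eps
    if not np.any(active):
        return 0.0

    per_block = dot[active] / (larger[active] ** 2)
    return float(per_block.mean())
```

Under this assumed form, a plain normalized correlation (cosine similarity) would give the same score to a slow and a fast motion in the same direction, whereas the sketch above scores them lower the more their magnitudes differ.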
