Successive Convex Matching for Action Detection

We propose a method for human action detection based on a successive convex matching scheme. Human actions are represented as sequences of postures, and specific actions are detected in video by matching the time-coupled posture sequences to video frames. Registering the template sequence to the video is formulated as an optimal matching problem. Instead of solving this highly non-convex problem directly, our method convexifies the matching problem into linear programs and refines the matching result by successively shrinking the trust region. The proposed scheme represents the target point space with small sets of basis points and therefore allows efficient searching. We apply this matching scheme to robustly match a sequence of coupled binary templates simultaneously in a video sequence with a cluttered background.
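
The abstract does not give the formulation, so the following is only a toy sketch of the convexify-then-shrink idea under stated assumptions: target positions are 1D, two template points are coupled by a desired offset `d`, the smoothness penalty is an L1 term weighted by `lam`, the basis points are taken to be the vertices of the lower convex hull of each cost curve, and the trust region is halved at each step. All function and parameter names are hypothetical. Because the example is so small, the relaxed problem is minimized by enumerating the breakpoints of its piecewise-linear convex objective, which stands in for the linear program a real implementation would solve.

```python
def cross(o, a, b):
    """2D cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def lower_hull(points):
    """Vertices of the lower convex hull of (position, cost) pairs.
    These vertices play the role of the basis points (an assumption)."""
    hull = []
    for p in sorted(points):
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull


def envelope(hull, x):
    """Evaluate the convex piecewise-linear envelope defined by the hull at x."""
    if len(hull) == 1:
        return hull[0][1]
    for (x0, c0), (x1, c1) in zip(hull, hull[1:]):
        if x0 <= x <= x1:
            return c0 + (x - x0) / (x1 - x0) * (c1 - c0)
    raise ValueError("x lies outside the hull")


def solve_relaxed(h1, h2, d, lam):
    """Minimize env1(x1) + env2(x2) + lam * |x1 - x2 - d|.
    The objective is convex and piecewise linear, so a minimizer lies at an
    intersection of the lines x1 = a, x2 = b, and x1 - x2 = d."""
    xs1 = [p[0] for p in h1]
    xs2 = [p[0] for p in h2]
    cands = [(a, b) for a in xs1 for b in xs2]
    cands += [(a, a - d) for a in xs1 if xs2[0] <= a - d <= xs2[-1]]
    cands += [(b + d, b) for b in xs2 if xs1[0] <= b + d <= xs1[-1]]

    def objective(x):
        return envelope(h1, x[0]) + envelope(h2, x[1]) + lam * abs(x[0] - x[1] - d)

    return min(cands, key=objective)


def successive_match(cost1, cost2, d, lam, shrink=0.5, min_radius=1.0):
    """Alternate between solving the convex relaxation and shrinking the
    trust region around its solution, then pick the best discrete pair."""
    costs = (cost1, cost2)
    center = [(len(c) - 1) / 2 for c in costs]
    radius = float(max(len(cost1), len(cost2)))

    def region(k):
        # Candidate positions of point k inside the current trust region.
        return [i for i in range(len(costs[k])) if abs(i - center[k]) <= radius]

    while True:
        hulls = [lower_hull([(i, costs[k][i]) for i in region(k)]) for k in (0, 1)]
        center = list(solve_relaxed(hulls[0], hulls[1], d, lam))
        if radius <= min_radius:
            break
        radius = max(min_radius, radius * shrink)

    # Final exhaustive search over the small remaining region (an assumption).
    pairs = [(i, j) for i in region(0) for j in region(1)]
    return min(pairs, key=lambda p: cost1[p[0]] + cost2[p[1]] + lam * abs(p[0] - p[1] - d))
```

For example, `successive_match([9, 9, 9, 9, 9, 9, 0, 9, 9], [9, 9, 9, 9, 0, 9, 9, 9, 9], d=2, lam=1.0)` returns `(6, 4)`. In a realistic setting the positions would be 2D image locations, many template points across several frames would be coupled, and the relaxed problem would be handed to an LP solver.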
