Recognizing coordinated multi-object activities using a dynamic event ensemble model

While video-based activity analysis and recognition have received broad attention, the existing body of work mostly deals with the single-object or single-person case. This paper focuses on modeling multiple objects and recognizing coordinated group activities, which arise in applications such as surveillance, sports, and biological records. Unlike earlier attempts, which model the complex spatio-temporal constraints among the activities of multiple objects with a parametric Bayesian network, we propose a dynamic ‘event ensemble’ framework, a data-driven strategy that characterizes the group motion pattern without employing any domain-specific knowledge. In particular, we exploit the Riemannian geometric property of the set of ensemble description functions and develop a compact representation for group activities on the ensemble manifold. An appropriate classifier on the manifold is then designed for recognizing new activities. Experiments on football play recognition demonstrate the effectiveness of the framework.
