Group context learning for event recognition

We address the problem of group-level event recognition in video. The events of interest are defined by the motion and interaction of group members over time; examples include group formation, dispersion, following, chasing, flanking, and fighting. To recognize these complex group events, we propose a novel approach that learns group-level scenario context from automatically extracted individual trajectories. We first perform a group structure analysis that produces a weighted graph representing the probabilistic group membership of the individuals. We then extract features from this graph to capture the motion and action contexts among the groups, and represent them using a “bag-of-words” scheme. Finally, a trained Support Vector Machine (SVM) classifies each video segment into one of the six event categories. Our implementation builds upon a mature multi-camera, multi-target tracking system and recognizes group-level events involving up to 20 individuals in real time.
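As a rough illustration of the first stages described above (trajectories, then a weighted membership graph, then a bag-of-words histogram), the following minimal sketch may help. It is not the authors' implementation: the Gaussian affinity, the parameters `sigma_d` and `sigma_v`, the function names, and the use of edge weights as the quantized feature are all assumptions made for this example. The final SVM step is omitted so that the sketch needs only the Python standard library.

```python
import math


def pairwise_affinity(traj_a, traj_b, sigma_d=2.0, sigma_v=1.0):
    """Soft group-membership weight in (0, 1] for two trajectories.

    Each trajectory is a list of (x, y) positions sampled at the same
    time steps. The Gaussian form and both sigmas are illustrative
    assumptions, not values taken from the paper.
    """
    n = len(traj_a)
    # Mean distance between the two individuals over the window.
    mean_dist = sum(math.dist(p, q) for p, q in zip(traj_a, traj_b)) / n
    # Mean difference between per-step displacement (velocity) vectors.
    vel_diff = 0.0
    for t in range(1, n):
        va = (traj_a[t][0] - traj_a[t - 1][0], traj_a[t][1] - traj_a[t - 1][1])
        vb = (traj_b[t][0] - traj_b[t - 1][0], traj_b[t][1] - traj_b[t - 1][1])
        vel_diff += math.dist(va, vb)
    vel_diff /= max(n - 1, 1)
    # Close, similarly moving individuals get a weight near 1.
    return (math.exp(-mean_dist ** 2 / (2 * sigma_d ** 2))
            * math.exp(-vel_diff ** 2 / (2 * sigma_v ** 2)))


def group_graph(trajectories):
    """Symmetric weight matrix over all individuals (zero diagonal)."""
    k = len(trajectories)
    w = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            w[i][j] = w[j][i] = pairwise_affinity(trajectories[i], trajectories[j])
    return w


def bag_of_words(features, codebook):
    """Normalized histogram of nearest-codeword assignments."""
    counts = [0] * len(codebook)
    for f in features:
        nearest = min(range(len(codebook)),
                      key=lambda c: math.dist(f, codebook[c]))
        counts[nearest] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]


# Two people walking side by side and a third far away, moving the other way.
walkers = [
    [(0, 0), (1, 0), (2, 0)],
    [(0, 1), (1, 1), (2, 1)],
    [(10, 10), (9, 10), (8, 10)],
]
graph = group_graph(walkers)
# Hypothetical 1-D features: the edge weights themselves.
edges = [(graph[i][j],) for i in range(3) for j in range(i + 1, 3)]
histogram = bag_of_words(edges, codebook=[(0.0,), (0.5,), (1.0,)])
```

In a fuller pipeline, one such histogram per video segment would be the input to the classifier; here the codebook is hand-picked, whereas in practice it would typically be learned, for example by clustering training features.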
