Where Are They Going? Predicting Human Behaviors in Crowded Scenes

In this article, we propose a framework for crowd behavior prediction in complicated scenarios. The fundamental framework is designed using the standard encoder-decoder scheme, which is built upon the long short-term memory module to capture the temporal evolution of crowd behaviors. To model interactions among humans and environments, we embed both the social and the physical attention mechanisms into the long short-term memory. The social attention component can model the interactions among different pedestrians, whereas the physical attention component helps to understand the spatial configurations of the scene. Since pedestrians’ behaviors demonstrate multi-modal properties, we use the generative model to produce multiple acceptable future paths. The proposed framework not only predicts an individual’s trajectory accurately but also forecasts the ongoing group behaviors by leveraging on the coherent filtering approach. Experiments are carried out on the standard crowd benchmarks (namely, the ETH, the UCY, the CUHK crowd, and the CrowdFlow datasets), which demonstrate that the proposed framework is effective in forecasting crowd behaviors in complex scenarios.

[1]  Silvio Savarese,et al.  Learning Social Etiquette: Human Trajectory Understanding In Crowded Scenes , 2016, ECCV.

[2]  Xiaogang Wang,et al.  Coherent Filtering: Detecting Coherent Motions from Crowd Clutters , 2012, ECCV.

[3]  Mohan M. Trivedi,et al.  Trajectory Learning for Activity Understanding: Unsupervised, Multilevel, and Long-Term Adaptive Approach , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4]  Sridha Sridharan,et al.  Soft + Hardwired Attention: An LSTM Framework for Human Trajectory Prediction and Abnormal Event Detection , 2017, Neural Networks.

[5]  Xiaogang Wang,et al.  Learning Collective Crowd Behaviors with Dynamic Pedestrian-Agents , 2014, International Journal of Computer Vision.

[6]  Chee Seng Chan,et al.  Crowd behavior analysis: A review where physics meets biology , 2015, Neurocomputing.

[7]  Dinesh Manocha,et al.  Reciprocal Velocity Obstacles for real-time multi-agent navigation , 2008, 2008 IEEE International Conference on Robotics and Automation.

[8]  Fei-Fei Li,et al.  Socially-Aware Large-Scale Crowd Forecasting , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[9]  Philip H. S. Torr,et al.  DESIRE: Distant Future Prediction in Dynamic Scenes with Interacting Agents , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Nuno Vasconcelos,et al.  Anomaly Detection and Localization in Crowded Scenes , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[11]  Bingbing Ni,et al.  Crowded Scene Analysis: A Survey , 2015, IEEE Transactions on Circuits and Systems for Video Technology.

[12]  Wentong Cai,et al.  Learning Behavior Patterns from Video: A Data-driven Framework for Agent-based Crowd Modeling , 2015, AAMAS.

[13]  Silvio Savarese,et al.  Social Scene Understanding: End-to-End Multi-person Action Localization and Collective Activity Recognition , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Andrea Cavallaro,et al.  Support Vector Motion Clustering , 2017, IEEE Transactions on Circuits and Systems for Video Technology.

[15]  Xiaogang Wang,et al.  Understanding pedestrian behaviors from stationary crowd groups , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Tal Hassner,et al.  Violent flows: Real-time detection of violent crowd behavior , 2012, 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

[17]  Dariu M. Gavrila,et al.  Human motion trajectory prediction: a survey , 2019, Int. J. Robotics Res..

[18]  Martial Hebert,et al.  Activity Forecasting , 2012, ECCV.

[19]  Carlo Tomasi,et al.  Good features to track , 1994, 1994 Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

[20]  Silvio Savarese,et al.  Knowledge Transfer for Scene-Specific Motion Prediction , 2016, ECCV.

[21]  Silvio Savarese,et al.  Forecasting Social Navigation in Crowded Complex Scenes , 2016, ArXiv.

[22]  Helbing,et al.  Social force model for pedestrian dynamics. , 1995, Physical review. E, Statistical physics, plasmas, fluids, and related interdisciplinary topics.

[23]  Robert T. Collins,et al.  Vision-Based Analysis of Small Groups in Pedestrian Crowds , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[24]  Mubarak Shah,et al.  Identifying Behaviors in Crowd Scenes Using Stability Analysis for Dynamical Systems , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[25]  Alberto Del Bimbo,et al.  Context-Aware Trajectory Prediction , 2017, 2018 24th International Conference on Pattern Recognition (ICPR).

[26]  Xiaogang Wang,et al.  Learning Scene-Independent Group Descriptors for Crowd Understanding , 2017, IEEE Transactions on Circuits and Systems for Video Technology.

[27]  Xiaogang Wang,et al.  Pedestrian Behavior Understanding and Prediction with Deep Neural Networks , 2016, ECCV.

[28]  Takeo Kanade,et al.  An Iterative Image Registration Technique with an Application to Stereo Vision , 1981, IJCAI.

[29]  Tobias Senst,et al.  Optical Flow Dataset and Benchmark for Visual Crowd Analysis , 2018, 2018 15th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS).

[30]  Carol O'Sullivan,et al.  Path patterns: analyzing and comparing real and simulated crowds , 2016, I3D.

[31]  Jean Oh,et al.  Social Attention: Modeling Attention in Human Crowds , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[32]  Silvio Savarese,et al.  Social LSTM: Human Trajectory Prediction in Crowded Spaces , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Silvio Savarese,et al.  Social GAN: Socially Acceptable Trajectories with Generative Adversarial Networks , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[34]  Silvio Savarese,et al.  SoPhie: An Attentive GAN for Predicting Paths Compliant to Social and Physical Constraints , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[35]  Mubarak Shah,et al.  A Lagrangian Particle Dynamics Approach for Crowd Flow Segmentation and Stability Analysis , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[36]  Nicolas Courty,et al.  Crowd Flow Characterization with Optimal Control Theory , 2009, ACCV.