Gate and common pathway detection in crowd scenes and anomaly detection using motion units and LSTM predictive models

In this paper, we propose two approaches for analyzing crowd scenes. The first is a motion-unit and meta-tracking based approach (the MUDAM approach). In this approach, the scene is divided into a number of dynamic divisions with coherent motion dynamics, called motion units (MUs). By analyzing the relationships between these MUs using a proposed continuation likelihood, the scene entrance and exit gates are retrieved. A meta-tracking procedure is then applied to retrieve the dominant motion pathways of the scene. To overcome the limitations of the MUDAM approach, and to detect some of the anomalies that may occur in these scenes, we propose a second, LSTM-based approach. In this approach, the scene is divided into a number of static, overlapping spatial regions named super regions (SRs), which together cover the whole scene. A Long Short-Term Memory (LSTM) network is used to define a predictive model for each SR. Each LSTM predictive model is trained on the tracklets of its own SR, so that it can capture the whole motion dynamics of that SR. Using a priori known scene entrance segments, the LSTM predictive models are applied to retrieve the dominant motion pathways of the scene. An anomaly metric is formulated for use with the LSTM predictive models to detect scene anomalies. Prototypes of the proposed approaches were developed and evaluated on the challenging New York Grand Central station scene, in addition to four other crowded scenes. Four types of anomalies that may occur in crowded scenes are defined, and the proposed LSTM-based approach is used to detect them. Anomaly detection experiments were also conducted on a number of data sets. Overall, the proposed approaches outperformed state-of-the-art methods in retrieving the scene gates and common pathways, in addition to detecting motion anomalies.
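
The abstract does not give the SR layout or the anomaly metric, so the following is only a rough sketch of the general idea, not the authors' method. It assumes square, axis-aligned super regions laid out on a strided grid, and it scores a tracklet by the mean distance between predicted and observed positions. A constant-velocity predictor stands in for the per-SR LSTM model purely so that the example runs without a deep learning library. All function names, the window size, and the stride are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
Region = Tuple[int, int, int, int]  # (x0, y0, x1, y1), assumed axis-aligned


def make_super_regions(width: int, height: int, size: int, stride: int) -> List[Region]:
    """Static windows covering the scene; they overlap whenever stride < size."""
    def starts(extent: int) -> List[int]:
        last = max(extent - size, 0)
        s = list(range(0, last + 1, stride))
        if s[-1] != last:  # make sure the far edge of the scene is covered
            s.append(last)
        return s

    return [(x, y, min(x + size, width), min(y + size, height))
            for y in starts(height) for x in starts(width)]


def regions_containing(p: Point, regions: Sequence[Region]) -> List[int]:
    """Indices of every region that contains point p (several, due to overlap)."""
    x, y = p
    return [i for i, (x0, y0, x1, y1) in enumerate(regions)
            if x0 <= x <= x1 and y0 <= y <= y1]


def constant_velocity(history: Sequence[Point]) -> Point:
    """Stand-in predictor (NOT an LSTM): extrapolate the last displacement."""
    (px, py), (cx, cy) = history[-2], history[-1]
    return (2 * cx - px, 2 * cy - py)


def anomaly_score(tracklet: Sequence[Point],
                  predict: Callable[[Sequence[Point]], Point]) -> float:
    """Mean Euclidean error between predicted and observed next positions."""
    errors = [math.dist(predict(tracklet[:t]), tracklet[t])
              for t in range(2, len(tracklet))]
    return sum(errors) / len(errors) if errors else 0.0


regions = make_super_regions(width=100, height=100, size=50, stride=25)
straight = [(0, 0), (1, 0), (2, 0), (3, 0)]
turning = [(0, 0), (1, 0), (2, 0), (2, 1)]
print(len(regions), regions_containing((30, 30), regions))
print(anomaly_score(straight, constant_velocity),
      anomaly_score(turning, constant_velocity))
```

In this sketch a tracklet that follows the predictor's expectation scores zero, while one that deviates scores higher. In a setup closer to the one described above, `predict` would be replaced by the trained model of the SR the tracklet falls in, and a threshold on the score would be chosen per scene.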
