Detecting spatiotemporal irregularities in videos via a 3D convolutional autoencoder

Abstract Spatiotemporal irregularities (i.e. , the uncommon appearance and motion patterns) in videos are difficult to detect, as they are usually not well defined and appear rarely in videos. We tackle this problem by learning normal patterns from regular videos, while treating irregularities as deviations from normal patterns. To this end, we introduce a 3D fully convolutional autoencoder (3D-FCAE) that is trainable in an end-to-end manner to detect both temporal and spatiotemporal irregularities in videos using limited training data. Subsequently, temporal irregularities can be detected as frames with high reconstruction errors, and irregular spatiotemporal patterns can be detected as blurry regions that are not well reconstructed. Our approach can accurately locate temporal and spatiotemporal irregularities thanks to the 3D fully convolutional autoencoder and the explored effective architecture. We evaluate the proposed autoencoder for detecting irregular patterns on benchmark video datasets with weak supervision. Comparisons with state-of-the-art approaches demonstrate the effectiveness of our approach. Moreover, the learned autoencoder shows good generalizability across multiple datasets.

[1]  Simone Calderara,et al.  Detecting anomalies in people's trajectories using spectral graph analysis , 2011, Comput. Vis. Image Underst..

[2]  Aggelos K. Katsaggelos,et al.  A Dynamic Hierarchical Clustering Method for Trajectory-Based Unusual Video Event Detection , 2009, IEEE Transactions on Image Processing.

[3]  Nicu Sebe,et al.  Learning Deep Representations of Appearance and Motion for Anomalous Event Detection , 2015, BMVC.

[4]  Lorenzo Torresani,et al.  Learning Spatiotemporal Features with 3D Convolutional Networks , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[5]  Ramin Mehran,et al.  Abnormal crowd behavior detection using social force model , 2009, CVPR.

[6]  Nuno Vasconcelos,et al.  Anomaly detection in crowded scenes , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[7]  Gian Luca Foresti,et al.  Trajectory-Based Anomalous Event Detection , 2008, IEEE Transactions on Circuits and Systems for Video Technology.

[8]  Aggelos K. Katsaggelos,et al.  Anomalous video event detection using spatiotemporal context , 2011 .

[9]  Hongbin Zha,et al.  Learning to Detect Anomalies in Surveillance Video , 2015, IEEE Signal Processing Letters.

[10]  David A. Forsyth,et al.  Video Event Detection: From Subvolume Localization to Spatiotemporal Path Search , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[11]  Mubarak Shah,et al.  Chaotic invariants of Lagrangian particle trajectories for anomaly detection in crowded scenes , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[12]  Louis Kratz,et al.  Anomaly detection in extremely crowded scenes using spatio-temporal motion pattern models , 2009, CVPR.

[13]  Yandong Tang,et al.  Video Anomaly Search in Crowded Scenes via Spatio-Temporal Motion Context , 2013, IEEE Transactions on Information Forensics and Security.

[14]  Shuai Li,et al.  Independently Recurrent Neural Network (IndRNN): Building A Longer and Deeper RNN , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[15]  Nuno Vasconcelos,et al.  Anomaly Detection and Localization in Crowded Scenes , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16]  Junsong Yuan,et al.  Sparse reconstruction cost for abnormal event detection , 2011, CVPR 2011.