Abnormal Behavior Recognition using CNN-LSTM with Attention Mechanism

Security incidents are on the rise in society, and reports of robberies, fights, and terrorism appear regularly around the world. Robust measures are therefore needed to ensure public safety, and this is where computer vision techniques come into play. Conventional surveillance cameras cannot autonomously detect abnormal behavior in footage, so the identification of abnormal activities depends solely on human judgement. Abnormal behavior has no absolute definition; it depends on the setting. For example, fighting in a martial arts class is normal behavior, whereas fighting in a bank is considered abnormal. In this study, we focus on two scopes: two-person interactions and crowd-based interactions. Our Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM) model with an attention mechanism automatically extracts the important features from the video frames and interprets the temporal information across the video sequence. Unlike typical neural networks, our model includes an attention mechanism that focuses on the salient parts of human actions. Five benchmark datasets are used to validate the performance of the proposed model.
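As a rough illustration of the attention idea described above, the sketch below applies soft attention over a sequence of per-frame feature vectors (for example, LSTM outputs computed from CNN features). The abstract does not specify the exact form of attention used, so the dot-product scoring, the `query` vector, the function names, and the toy values are all assumptions made for illustration, not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Soft attention over per-frame vectors (hypothetical helper).

    hidden_states: list of T vectors of length D, one per frame.
    query: assumed learned vector of length D used to score each frame.
    Returns (context, weights): the weighted sum of the frame vectors
    and the attention weight assigned to each frame.
    """
    # Score each frame by its dot product with the query vector.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    # Normalize the scores so the weights are positive and sum to 1.
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the frame vectors.
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states))
        for d in range(len(query))
    ]
    return context, weights


# Toy input: three frames with 2-dimensional features (made-up values).
hidden_states = [[0.1, 0.0], [0.2, 0.1], [2.0, 1.5]]
query = [1.0, 1.0]
context, weights = attention_pool(hidden_states, query)
```

In this toy input the third frame has the largest score and therefore receives the largest weight, which is the sense in which attention emphasizes salient frames. In a full model the context vector would typically feed a classifier that outputs normal versus abnormal behavior.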
