Paper Information - Single Run Action Detector over Video Stream - A Privacy Preserving Approach

This paper takes initial strides toward designing and evaluating a vision-based system for privacy-ensured activity monitoring. The proposed technology uses Artificial Intelligence (AI)-empowered proactive systems that offer continuous monitoring, behavioral analysis, and modeling of human activities. To this end, the paper presents the Single Run Action Detector (S-RAD), a real-time, privacy-preserving action detector that performs end-to-end action localization and classification. It is based on Faster R-CNN combined with temporal shift modeling and segment-based sampling to capture human actions. Results on the UCF-Sports and UR Fall datasets show accuracy comparable to state-of-the-art approaches with significantly lower model size and computation demand, along with the ability to run in real time on an embedded edge device (e.g., Nvidia Jetson Xavier).
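The two temporal ingredients named in the abstract, segment-based sampling and temporal shift modeling, can be illustrated with a minimal NumPy sketch. This is not the S-RAD implementation: the function names, the choice of the middle frame of each segment, the `(T, C, H, W)` feature layout, and the `shift_div=8` fraction are all assumptions made for illustration.

```python
import numpy as np


def sample_segment_indices(num_frames, num_segments):
    """Split a clip into equal-length segments and pick each segment's middle frame.

    Hypothetical helper: the sampling rule (middle frame) is an assumption.
    """
    edges = np.linspace(0, num_frames, num_segments + 1)
    centers = (edges[:-1] + edges[1:]) / 2
    return np.minimum(centers.astype(int), num_frames - 1)


def temporal_shift(features, shift_div=8):
    """Shift a fraction of channels along the time axis, zero-padding at the ends.

    features: array of shape (T, C, H, W), one feature map per sampled frame.
    The first C // shift_div channels are pulled from the next frame, the
    following C // shift_div channels from the previous frame, and the
    remaining channels are left untouched.
    """
    t, c, h, w = features.shape
    fold = c // shift_div
    out = np.zeros_like(features)
    out[:-1, :fold] = features[1:, :fold]                   # take from frame t+1
    out[1:, fold:2 * fold] = features[:-1, fold:2 * fold]   # take from frame t-1
    out[:, 2 * fold:] = features[:, 2 * fold:]              # unshifted channels
    return out


if __name__ == "__main__":
    idx = sample_segment_indices(num_frames=80, num_segments=8)
    print(idx)
    feats = np.random.rand(len(idx), 64, 7, 7)
    print(temporal_shift(feats).shape)
```

Because the shift only moves existing values between neighboring frames, it mixes temporal information without adding parameters or multiplications, which is consistent with the abstract's emphasis on low model size and computation demand.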

Hassan Ghasemzadeh | Hamed Tabkhi | Justin Sanchez | Aurelia Macabasco-O'Connell | Anbumalar Saravanan
