STCNet: spatiotemporal cross network for industrial smoke detection

Industrial smoke emissions pose a serious threat to natural ecosystems and human health. Prior work has shown that computer vision offers a low-cost and convenient way to identify smoke. However, industrial smoke detection remains challenging because emission particles often decay rapidly outside the stacks or facilities, and steam looks very similar to smoke. To overcome these problems, we propose a novel Spatio-Temporal Cross Network (STCNet) to recognize industrial smoke emissions. STCNet comprises a spatial pathway that extracts texture features and a temporal pathway that captures smoke motion information. We hypothesize that the spatial and temporal pathways can guide each other: the spatial pathway can easily recognize obvious interference such as trees and buildings, while the temporal pathway can highlight the obscure traces of smoke movement. Such mutual guidance should benefit smoke detection performance. In addition, we design an efficient and concise spatio-temporal dual pyramid architecture to ensure better fusion of multi-scale spatiotemporal information. Finally, extensive experiments on a public dataset show that STCNet outperforms the best competitors on the challenging RISE industrial smoke detection dataset by a clear margin of 6.2%. The code will be available at: https://github.com/Caoyichao/STCNet.
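
The idea of two pathways guiding each other can be illustrated with a toy sketch. This is not the STCNet implementation: the abstract does not specify how the guidance is computed, so the sigmoid gating below, the use of frame differences as a temporal cue, the use of a per-pixel mean as a stand-in for texture features, and all function names (`frame_differences`, `mean_map`, `cross_guide`) are assumptions made purely for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def frame_differences(frames):
    """Temporal cue (assumed): absolute per-pixel change between consecutive frames."""
    return [
        [[abs(b - a) for a, b in zip(row_prev, row_next)]
         for row_prev, row_next in zip(prev, nxt)]
        for prev, nxt in zip(frames, frames[1:])
    ]

def mean_map(maps):
    """Average a list of equally sized 2D maps into a single 2D map."""
    n = len(maps)
    h, w = len(maps[0]), len(maps[0][0])
    return [[sum(m[i][j] for m in maps) / n for j in range(w)] for i in range(h)]

def cross_guide(spatial, temporal):
    """Hypothetical mutual gating: each map is reweighted by a sigmoid of the other."""
    h, w = len(spatial), len(spatial[0])
    guided_spatial = [[spatial[i][j] * sigmoid(temporal[i][j]) for j in range(w)]
                      for i in range(h)]
    guided_temporal = [[temporal[i][j] * sigmoid(spatial[i][j]) for j in range(w)]
                       for i in range(h)]
    return guided_spatial, guided_temporal

# Tiny 3-frame, 2x2 grayscale clip: the left column is static (e.g. a building),
# the right column changes over time (e.g. drifting smoke).
frames = [
    [[0.9, 0.1], [0.9, 0.1]],
    [[0.9, 0.4], [0.9, 0.2]],
    [[0.9, 0.7], [0.9, 0.1]],
]
spatial = mean_map(frames)                      # stand-in for texture features
temporal = mean_map(frame_differences(frames))  # stand-in for motion features
guided_spatial, guided_temporal = cross_guide(spatial, temporal)
```

In this toy clip the static left column has zero temporal response, so it receives the lowest gate on the spatial map, while the changing right column keeps a non-zero motion signal. A real model would learn such interactions from data rather than use a fixed gate.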
