Two-Stream Convolutional Neural Networks for Emergency Recognition in Images

Emergencies threaten public safety and property. If news agencies report emergencies promptly, the resulting harm can be significantly reduced. However, given the massive volume of incoming pictures, traditional manual screening can no longer meet the needs of news agencies. A more effective method for classifying emergencies is therefore needed, one that helps news agencies choose the right pictures and release them to the public in time. This paper proposes a method for classifying emergencies in still images using two-stream convolutional neural networks (CNNs). First, the two-stream architecture is decomposed into an object net and a scene net, which extract useful information from the perspective of objects and of scene context, respectively. We also investigate different methods of fusing the features of the two streams to improve emergency recognition performance. Second, a binary classifier applied after the two-stream CNNs verifies whether their output is a true positive instance of the predicted emergency class. Experimental results confirm the effectiveness of the proposed emergency recognition method.
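
The two-stage pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify which fusion method performs best or what input the binary verifier receives, so the sketch assumes a simple weighted average of the two streams' class probabilities (score-level fusion) and a hypothetical `verifier` callable that returns the probability that the predicted class is a true positive. The names `fuse_scores`, `recognize`, `object_weight`, and `threshold` are illustrative only.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(object_logits, scene_logits, object_weight=0.5):
    """Assumed fusion: weighted average of the two streams' probabilities."""
    p_object = softmax(object_logits)
    p_scene = softmax(scene_logits)
    return [
        object_weight * o + (1.0 - object_weight) * s
        for o, s in zip(p_object, p_scene)
    ]


def recognize(object_logits, scene_logits, verifier, classes, threshold=0.5):
    """Stage 1: fuse both streams and pick a class. Stage 2: verify it.

    `verifier` is a hypothetical stand-in for the binary classifier; it takes
    the predicted class index and its fused score and returns a probability
    that the prediction is a true positive. Returns None when rejected.
    """
    fused = fuse_scores(object_logits, scene_logits)
    best = max(range(len(fused)), key=lambda i: fused[i])
    if verifier(best, fused[best]) >= threshold:
        return classes[best]
    return None
```

In this sketch, a call such as `recognize(obj_logits, scene_logits, verifier, ["fire", "flood", "earthquake"])` returns a class label only when the verifier accepts the fused prediction, which mirrors the role the abstract assigns to the second-stage classifier: filtering out false positives before an image is reported.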
