Scene-Aware Deep Networks for Semantic Segmentation of Images

Scene classification and semantic segmentation are two important research directions in computer vision. They are widely used in the research of automatic driving and human–computer interaction. The purpose of the scene classification is to use the image classification to determine the category of the scene in an image by analyzing the background and the target object, while semantic segmentation aims to classify the image at the pixel level and mark the position and semantic information of the scene unit. In this paper, we aimed to train the semantic segmentation neural network in different scenarios to obtain the models with the same number of scene categories, which they are used to process the images. During the process of the actual test, the semantic segmentation dataset was firstly divided into three categories based on the scene classification algorithm. Then the semantic segmentation neural network is trained under three scenarios, and three semantic segmentation network models are obtained accordingly. To test the property of our methods, the semantic segmentation models we got were selected to treat other pictures, and the results obtained from the performance of scene-aware semantic segmentation were much better than semantic segmentation without considering categories. Our study provided an essential improvement of semantic segmentation by adding category information into consideration, which will be helpful to obtain more precise models for further picture analysis.

[1]  Huimin Lu,et al.  Underwater image dehazing using joint trilateral filter , 2014, Comput. Electr. Eng..

[2]  Antonio Torralba,et al.  Building the gist of a scene: the role of global image features in recognition. , 2006, Progress in brain research.

[3]  한보형,et al.  Learning Deconvolution Network for Semantic Segmentation , 2015 .

[4]  Antonio Torralba,et al.  Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope , 2001, International Journal of Computer Vision.

[5]  C. V. Jawahar,et al.  Blocks That Shout: Distinctive Parts for Scene Classification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[6]  George Papandreou,et al.  Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation , 2018, ECCV.

[7]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[8]  Thomas Brox,et al.  U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[9]  Huimin Lu,et al.  Motor Anomaly Detection for Unmanned Aerial Vehicles Using Reinforcement Learning , 2018, IEEE Internet of Things Journal.

[10]  Bolei Zhou,et al.  Scene Parsing through ADE20K Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[12]  Yi Chen,et al.  Image classification using label constrained sparse coding , 2016, Multimedia Tools and Applications.

[13]  Huimin Lu,et al.  CONet: A Cognitive Ocean Network , 2019, IEEE Wireless Communications.

[14]  Iasonas Kokkinos,et al.  DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15]  Huimin Lu,et al.  Low illumination underwater light field images reconstruction using deep convolutional neural networks , 2018, Future Gener. Comput. Syst..

[16]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Xiaogang Wang,et al.  Pyramid Scene Parsing Network , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  Hao Su,et al.  Object Bank: A High-Level Image Representation for Scene Classification & Semantic Feature Sparsification , 2010, NIPS.

[19]  Iasonas Kokkinos,et al.  Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs , 2014, ICLR.

[20]  Fereshteh Sadeghi,et al.  Latent Pyramidal Regions for Recognizing Scenes , 2012, ECCV.

[21]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[22]  Hao Su,et al.  Objects as Attributes for Scene Classification , 2010, ECCV Workshops.

[23]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[24]  Huimin Lu,et al.  Brain Intelligence: Go beyond Artificial Intelligence , 2017, Mobile Networks and Applications.

[25]  Sanja Fidler,et al.  The Role of Context for Object Detection and Semantic Segmentation in the Wild , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[26]  Sergey Ioffe,et al.  Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Antonio Torralba,et al.  Nonparametric scene parsing: Label transfer via dense scene alignment , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[28]  Matthijs C. Dorst Distinctive Image Features from Scale-Invariant Keypoints , 2011 .

[29]  Sebastian Ramos,et al.  The Cityscapes Dataset for Semantic Urban Scene Understanding , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).