Outdoor Scenes Pixel-wise Semantic Segmentation using Polarimetry and Fully Convolutional Network

In this paper, we propose a novel method for pixel-wise scene segmentation using polarimetry. To address the difficulty of detecting highly reflective areas such as water and windows, we use the angle and degree of polarization of these areas, obtained by processing images from a polarimetric camera. A deep learning framework based on an encoder-decoder architecture is used to segment the regions of interest. Several augmentation methods have been developed to obtain a sufficient amount of data while preserving the physical properties of the polarimetric images. Moreover, we introduce a new dataset comprising both RGB and polarimetric images, with manual ground-truth annotations for seven different classes. Experimental results on this dataset show that deep learning can benefit from polarimetry and obtain better segmentation results than with the RGB modality. In particular, we obtain improvements of 38.35% and 22.92% in accuracy for segmenting windows and cars, respectively.
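As an illustration of how the angle and degree of polarization can be derived from polarimetric measurements, the following minimal sketch computes both quantities for a single pixel from the linear Stokes parameters. It assumes intensities captured behind linear polarizers oriented at 0, 45, 90 and 135 degrees, which is a common acquisition scheme but is not stated in the abstract; the function names are hypothetical and this is not necessarily the processing pipeline used in the paper.

```python
import math


def stokes_from_intensities(i0, i45, i90, i135):
    """Linear Stokes parameters from four polarizer orientations (assumed setup)."""
    s0 = 0.5 * (i0 + i45 + i90 + i135)  # total intensity
    s1 = i0 - i90                       # horizontal vs. vertical component
    s2 = i45 - i135                     # +45 vs. -45 degree component
    return s0, s1, s2


def polarization_features(i0, i45, i90, i135):
    """Return (degree, angle) of linear polarization for one pixel.

    The degree lies in [0, 1]; the angle is in radians in (-pi/2, pi/2].
    """
    s0, s1, s2 = stokes_from_intensities(i0, i45, i90, i135)
    if s0 <= 0.0:
        # No light reached the pixel: polarization is undefined, report zeros.
        return 0.0, 0.0
    # Clamp to 1.0 because sensor noise can push the ratio slightly above it.
    dolp = min(math.hypot(s1, s2) / s0, 1.0)
    aolp = 0.5 * math.atan2(s2, s1)
    return dolp, aolp


if __name__ == "__main__":
    # Fully polarized light at 45 degrees (intensities follow Malus's law).
    dolp, aolp = polarization_features(0.5, 1.0, 0.5, 0.0)
    print(f"degree={dolp:.3f}, angle={math.degrees(aolp):.1f} deg")
```

Applied to every pixel, the two outputs form the degree and angle maps that can be fed to a segmentation network alongside, or instead of, the RGB channels. Note that the angle is a directional quantity: geometric augmentations such as flips or rotations would need to transform its values as well as the pixel positions to remain physically consistent.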
