Robust Semantic Segmentation in Adverse Weather Conditions by means of Fast Video-Sequence Segmentation

Computer vision methods such as semantic segmentation perform very well in good weather, but they struggle to maintain this performance when the weather turns bad. One way to obtain more robust and reliable results in adverse weather conditions is to use video-segmentation approaches instead of the commonly used single-image segmentation methods. Video-segmentation approaches capture temporal information from previous video frames in addition to the current image information and are therefore more robust against disturbances, especially those that occur in only a few frames of the video sequence. However, video-segmentation approaches, which are often based on recurrent neural networks, can no longer be used in real-time applications, since their recurrent structures are computationally expensive. For instance, the inference time of the LSTM-ICNet, in which recurrent units are placed at suitable positions in the single-image segmentation approach ICNet, increases by up to 61 percent compared to the basic ICNet. Hence, in this work, the LSTM-ICNet is sped up by modifying its recurrent units so that it becomes real-time capable again. Experiments on different datasets and in various weather conditions show that these modifications decrease the inference time by about 23 percent while achieving performance similar to that of the LSTM-ICNet and outperforming the single-image segmentation approach by a large margin in adverse weather conditions.
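To put the two percentages in relation to each other, the following minimal sketch computes inference times relative to the basic ICNet. It rests on two assumptions that the abstract does not state explicitly: that the 23 percent reduction is measured against the LSTM-ICNet, and that the upper bound of 61 percent overhead applies. The function name `relative_inference_time` is hypothetical and used only for illustration.

```python
def relative_inference_time(overhead_pct: float, reduction_pct: float) -> float:
    """Return inference time relative to a baseline whose time is 1.0.

    overhead_pct:  slowdown of the recurrent network versus the baseline.
    reduction_pct: speed-up of the modified network versus the recurrent one
                   (assumption: the reduction is relative to the LSTM-ICNet).
    """
    slowed = 1.0 + overhead_pct / 100.0
    return slowed * (1.0 - reduction_pct / 100.0)


# Figures quoted in the abstract; 61 percent is the stated upper bound.
lstm_icnet = relative_inference_time(61.0, 0.0)
fast_variant = relative_inference_time(61.0, 23.0)

print(f"LSTM-ICNet vs. ICNet:       {lstm_icnet:.2f}x")
print(f"Modified variant vs. ICNet: {fast_variant:.2f}x")
```

Under these assumptions, the modified network would still take roughly 1.24 times as long as the basic ICNet in the worst case, compared to 1.61 times for the unmodified LSTM-ICNet.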
