Do motion boundaries improve semantic segmentation?

Precise localization is crucial to many computer vision tasks. Optical flow can help by providing motion boundaries, which can serve as a proxy for object boundaries. This paper investigates how useful these motion boundaries are for improving semantic segmentation. As no dataset is readily available for this task, we compute motion boundary maps on the CamVid dataset [3] with a pre-trained model from [17]. Using these motion boundary maps together with the corresponding RGB images, we train a convolutional neural network end-to-end for semantic segmentation. The experimental results show that the network learns to incorporate the motion boundaries and that they improve object localization.
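The abstract does not say how the motion boundary maps and the RGB images are combined at the network input. One simple possibility, shown here purely as an illustrative sketch and not as the authors' method, is early fusion: the boundary map is appended to the RGB image as a fourth channel. The function name `stack_rgb_and_boundaries`, the array shapes, and the scaling to [0, 1] are all assumptions made for this example.

```python
import numpy as np


def stack_rgb_and_boundaries(rgb, boundary):
    """Append a motion boundary map (H, W) to an RGB image (H, W, 3).

    Hypothetical helper: returns a float32 array of shape (H, W, 4) that
    a segmentation network with a 4-channel first layer could consume.
    """
    rgb = np.asarray(rgb, dtype=np.float32)
    boundary = np.asarray(boundary, dtype=np.float32)
    if rgb.ndim != 3 or rgb.shape[2] != 3:
        raise ValueError("rgb must have shape (H, W, 3)")
    if boundary.shape != rgb.shape[:2]:
        raise ValueError("boundary must have shape (H, W) matching rgb")
    # Assumption: rgb holds 8-bit values in [0, 255] and the boundary map
    # already holds boundary strengths in [0, 1].
    return np.concatenate([rgb / 255.0, boundary[..., None]], axis=-1)


# Toy inputs of arbitrary size: a white image and an empty boundary map.
rgb = np.full((4, 6, 3), 255, dtype=np.uint8)
boundary = np.zeros((4, 6), dtype=np.float32)
fused = stack_rgb_and_boundaries(rgb, boundary)
```

With this layout, only the first convolutional layer of an RGB segmentation network would need to change to accept four input channels; whether the paper uses this or a different fusion scheme is not stated in the abstract.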

[1] Karteek Alahari, et al. Weakly-Supervised Semantic Segmentation Using Motion Cues, 2016, ECCV.

[2] Ruigang Yang, et al. Semantic Segmentation of Urban Scenes Using Dense Depth Maps, 2010, ECCV.

[3] Vittorio Ferrari, et al. Fast Object Segmentation in Unconstrained Video, 2013, 2013 IEEE International Conference on Computer Vision.

[4] Bernt Schiele, et al. Taking a deeper look at pedestrians, 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5] Cordelia Schmid, et al. Dense Trajectories and Motion Boundary Descriptors for Action Recognition, 2013, International Journal of Computer Vision.

[6] Jianbo Shi, et al. Semantic Segmentation with Boundary Neural Fields, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7] C. Lawrence Zitnick, et al. Structured Forests for Fast Edge Detection, 2013, 2013 IEEE International Conference on Computer Vision.

[8] Michael J. Black, et al. A Naturalistic Open Source Movie for Optical Flow Evaluation, 2012, ECCV.

[9] Thomas Brox, et al. FlowNet: Learning Optical Flow with Convolutional Networks, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[10] Roberto Cipolla, et al. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[11] Michael J. Black, et al. Optical Flow with Semantic Segmentation and Localized Layers, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12] Thomas Brox, et al. High Accuracy Optical Flow Estimation Based on a Theory for Warping, 2004, ECCV.

[13] Cordelia Schmid, et al. Learning to detect Motion Boundaries, 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14] Roberto Cipolla, et al. Semantic object classes in video: A high-definition ground truth database, 2009, Pattern Recognit. Lett.

[15] Iasonas Kokkinos, et al. Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs, 2014, ICLR.

[16] Rob Fergus, et al. Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-scale Convolutional Architecture, 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[17] Jörg Stückler, et al. Reconstructing Street-Scenes in Real-Time from a Driving Car, 2015, 2015 International Conference on 3D Vision.