Efficient deep models for monocular road segmentation

This paper addresses the problem of road scene segmentation in conventional RGB images by exploiting recent advances in semantic segmentation via convolutional neural networks (CNNs). Segmentation networks are very large and do not currently run at interactive frame rates. To make this technique applicable to robotics we propose several architecture refinements that provide the best trade-off between segmentation quality and runtime. This is achieved by a new mapping between classes and filters at the expansion side of the network. The network is trained end-to-end and yields precise road/lane predictions at the original input resolution in roughly 50ms. Compared to the state of the art, the network achieves top accuracies on the KITTI dataset for road and lane segmentation while providing a 20× speed-up. We demonstrate that the improved efficiency is not due to the road segmentation task. Also on segmentation datasets with larger scene complexity, the accuracy does not suffer from the large speed-up.

[1]  Rahul Mohan,et al.  Deep Deconvolutional Networks for Scene Parsing , 2014, ArXiv.

[2]  Thomas Brox,et al.  U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[3]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Iasonas Kokkinos,et al.  Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs , 2014, ICLR.

[5]  Giovani Bernardes Vitor,et al.  A probabilistic distribution approach for the classification of urban roads in complex environments , 2014 .

[6]  Thomas Brox,et al.  Learning to generate chairs with convolutional neural networks , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Joachim Denzler,et al.  Convolutional Patch Networks with Spatial Prior for Road Detection and Urban Scene Understanding , 2015, VISAPP.

[8]  Sanja Fidler,et al.  Detect What You Can: Detecting and Representing Objects Using Holistic Models and Body Parts , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[9]  S. BEUCHER,et al.  ROAD SEGMENTATION BY WATERSHEDS ALGORITHMS , 1990 .

[10]  Liang Xiao,et al.  CRF based road detection with multi-sensor fusion , 2015, 2015 IEEE Intelligent Vehicles Symposium (IV).

[11]  Mohamed Aly,et al.  Real time detection of lane markers in urban streets , 2008, 2008 IEEE Intelligent Vehicles Symposium.

[12]  Jannik Fritsch,et al.  A new performance measure and evaluation benchmark for road detection algorithms , 2013, 16th International IEEE Conference on Intelligent Transportation Systems (ITSC 2013).

[13]  Wolfram Burgard,et al.  Deep learning for human part discovery in images , 2016, 2016 IEEE International Conference on Robotics and Automation (ICRA).

[14]  Luis Miguel Bergasa,et al.  CRF-based semantic labeling in miniaturized road scenes , 2014, 17th International IEEE Conference on Intelligent Transportation Systems (ITSC).

[15]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[16]  Yann LeCun,et al.  Road Scene Segmentation from a Single Image , 2012, ECCV.

[17]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[18]  Martial Hebert,et al.  Stacked Hierarchical Labeling , 2010, ECCV.