A fusion network for road detection via spatial propagation and spatial transformation

Abstract: In this paper, we address the fusion of image and point cloud data for road detection. To take advantage of both deep networks and multi-modal data fusion, we propose an end-to-end road segmentation network called SPSTFN (Spatial Propagation and Spatial Transformation Fusion Network). For the first time, our method considers model-level fusion and dual-view fusion simultaneously within the network. Specifically, the proposed SPSTFN contains three parts: the point cloud branch, the image branch, and the fusion block. First, we design a simple but efficient lightweight network that handles the unordered and sparse point cloud and produces a coarse representation of the road area. Second, an equal-resolution convolutional block captures the low-level features of the image, which are used to produce the heat diffusion coefficients of the spatial propagation model based on joint anisotropic diffusion. Third, we run the diffusion process on the coarse representation under the guidance of the learned low-level image features, in both the perspective view and the bird's-eye view, via spatial transformation within the network. Finally, the diffusion results of the two views are integrated to generate the final refined representation of the road area. The proposed fusion method is entirely data-driven and parameter-free, and the whole fusion network can be trained with the standard BP (Back Propagation) algorithm. Without any additional processing steps or pre-training, the proposed method obtains competitive results on the KITTI Road Benchmark.
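
The guided diffusion step described in the abstract can be illustrated with a small NumPy sketch. It is not the SPSTFN implementation: the abstract states that the diffusion coefficients are produced from learned low-level image features, whereas this sketch assumes a hand-crafted Perona-Malik style coefficient computed from a single-channel guidance image. The function names (`neighbor_diffs`, `guided_diffusion`) and the parameters `steps`, `lam`, and `kappa` are hypothetical and chosen only for illustration. The point cloud branch, the bird's-eye view transformation, and the fusion of the two views are left out.

```python
import numpy as np

def neighbor_diffs(x):
    """Differences to the 4 neighbors (N, S, E, W), using replicated borders."""
    p = np.pad(x, 1, mode="edge")
    c = p[1:-1, 1:-1]
    return (p[:-2, 1:-1] - c,   # north neighbor minus center
            p[2:, 1:-1] - c,    # south
            p[1:-1, 2:] - c,    # east
            p[1:-1, :-2] - c)   # west

def guided_diffusion(coarse, guide, steps=50, lam=0.2, kappa=0.1):
    """Diffuse `coarse` with coefficients taken from `guide` (both H x W).

    Assumption: coefficients follow exp(-(dg / kappa)^2), so diffusion is
    strong where the guidance image is flat and nearly blocked across its
    edges. In the paper these coefficients come from learned features.
    """
    coeffs = [np.exp(-(d / kappa) ** 2) for d in neighbor_diffs(guide)]
    u = coarse.astype(float).copy()
    for _ in range(steps):
        # Explicit update; lam <= 0.25 keeps each step a convex combination.
        u = u + lam * sum(c * d for c, d in zip(coeffs, neighbor_diffs(u)))
    return u

# Toy example: the guidance image has a vertical edge between columns 3 and 4,
# and the coarse map marks a small block of "road" on the left side only.
guide = np.zeros((8, 8))
guide[:, 4:] = 1.0
coarse = np.zeros((8, 8))
coarse[2:6, 1:3] = 1.0
refined = guided_diffusion(coarse, guide)
```

In this toy case the coarse response spreads across the left region but does not leak across the guidance edge, which is the behavior the image-guided propagation is meant to provide.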
