Road scene segmentation via fusing camera and lidar data

This paper presents an approach for pixel-wise object segmentation for road scenes based on the integration of a color image and an aligned 3D point cloud. In light of the advantage of range information in object discovery, we first produce initial object hypotheses by clustering the sparse 3D point cloud. The image pixels registered to the clustered 3D points are taken as samples to learn each object's prior knowledge. The priors are represented by Gaussian Mixture Models (GMMs) of color and 3D location information only, requiring no high-level features. We further formulate the segmentation problem within a Conditional Random Field (CRF) framework, which incorporates the learned prior models, together with hard constraints placed on the registered pixels and pairwise spatial constraints to achieve final results. Our algorithm is validated on the challenging KITTI dataset which contains diverse complicated road scenarios. Both qualitative and quantitative evaluation results show the superiority of our algorithm.

[1]  Luc Van Gool,et al.  Depth and Appearance for Mobile Scene Analysis , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[2]  Ryan M. Eustice,et al.  Ford Campus vision and lidar data set , 2011, Int. J. Robotics Res..

[3]  Antonio Criminisi,et al.  TextonBoost for Image Understanding: Multi-Class Object Recognition and Segmentation by Jointly Modeling Texture, Layout, and Context , 2007, International Journal of Computer Vision.

[4]  Yann LeCun,et al.  Indoor Semantic Segmentation using depth information , 2013, ICLR.

[5]  Marie-Pierre Jolly,et al.  Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[6]  Camille Couprie,et al.  Learning Hierarchical Features for Scene Labeling , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7]  Xiaojin Gong,et al.  Integrating visual and range data for road detection , 2013, 2013 IEEE International Conference on Image Processing.

[8]  Xiaojin Gong,et al.  Guided Depth Enhancement via Anisotropic Diffusion , 2013, PCM.

[9]  Ali Shahrokni,et al.  Urban 3D semantic modelling using stereo vision , 2013, 2013 IEEE International Conference on Robotics and Automation.

[10]  Tsuhan Chen,et al.  Automatic discovery of groups of objects for scene understanding , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[11]  Bertrand Douillard,et al.  On the segmentation of 3D LIDAR point clouds , 2011, 2011 IEEE International Conference on Robotics and Automation.

[12]  Radu Bogdan Rusu,et al.  Semantic 3D Object Maps for Everyday Manipulation in Human Living Environments , 2010, KI - Künstliche Intelligenz.

[13]  Yann LeCun,et al.  Road Scene Segmentation from a Single Image , 2012, ECCV.

[14]  Jitendra Malik,et al.  A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[15]  Andrew Blake,et al.  "GrabCut" , 2004, ACM Trans. Graph..

[16]  Roberto Cipolla,et al.  Segmentation and Recognition Using Structure from Motion Point Clouds , 2008, ECCV.

[17]  Jean Ponce,et al.  Vanishing point detection for road detection , 2009, CVPR.

[18]  Fei-Fei Li,et al.  Object discovery in 3D scenes via shape analysis , 2013, 2013 IEEE International Conference on Robotics and Automation.

[19]  Philip H. S. Torr,et al.  Combining Appearance and Structure from Motion Features for Road Scene Understanding , 2009, BMVC.

[20]  Pushmeet Kohli,et al.  Associative hierarchical CRFs for object class image segmentation , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[21]  Stephen Gould,et al.  Decomposing a scene into geometric and semantically consistent regions , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[22]  Theo Gevers,et al.  3D Scene priors for road detection , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[23]  Ruigang Yang,et al.  Semantic Segmentation of Urban Scenes Using Dense Depth Maps , 2010, ECCV.

[24]  Alexei A. Efros,et al.  Putting Objects in Perspective , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[25]  Luc Van Gool,et al.  Dynamic 3D Scene Analysis from a Moving Vehicle , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[26]  W. F. Clocksin,et al.  Joint Optimization for Object Class Segmentation and Dense Stereo Reconstruction , 2012, International Journal of Computer Vision.

[27]  Hong Zhang,et al.  An evaluation metric for image segmentation of multiple objects , 2009, Image Vis. Comput..

[28]  Andreas Geiger,et al.  Are we ready for autonomous driving? The KITTI vision benchmark suite , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[29]  Michael Himmelsbach,et al.  LIDAR-based 3D Object Perception , 2008 .

[30]  Junsong Yuan,et al.  Fusion of Velodyne and camera data for scene parsing , 2012, 2012 15th International Conference on Information Fusion.