Topology Preserving Local Road Network Estimation from Single Onboard Camera Image

Knowledge of the road network topology is crucial for autonomous planning and navigation. Yet recovering such topology from a single image has only been explored in part. Furthermore, the topology needs to be expressed in the ground plane, where driving actions are also taken. This paper aims at extracting the local road network topology directly in the bird's-eye view (BEV), in a complex urban setting. The only input is a single onboard, forward-looking camera image. We represent the road topology using a set of directed lane curves and their interactions, which are captured using their intersection points. To better capture topology, we introduce the concept of minimal cycles and their covers. A minimal cycle is the smallest cycle formed by the directed curve segments (between two intersections). The cover is the set of curves whose segments are involved in forming a minimal cycle. We first show that the covers suffice to uniquely represent the road topology. The covers are then used to supervise deep neural networks, along with the lane curve supervision. These networks learn to predict the road topology from a single input image. The results on the NuScenes and Argoverse benchmarks are significantly better than those obtained with baselines. Code: https://github.com/ybarancan/TopologicalLaneGraph.
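To make the notions of minimal cycle and cover concrete, the following is a minimal sketch, not the authors' implementation. It assumes the lane curves have already been split into straight segments between intersection points, that segments meet only at those points, and that at most one segment joins any pair of points. It also ignores segment direction when tracing cycles, which is a simplification. The function name `minimal_cycle_covers` and the toy layout (two horizontal curves `h1`, `h2` crossed by three vertical curves `v1`, `v2`, `v3`) are hypothetical. Minimal cycles are found as the bounded faces of the resulting planar graph, and each cover is the set of curve ids on a face boundary.

```python
import math
from collections import defaultdict


def minimal_cycle_covers(points, segments):
    """Return one cover (frozenset of curve ids) per bounded face.

    points:   {vertex_name: (x, y)} intersection coordinates (assumed input)
    segments: [(u, v, curve_id)] curve pieces between two intersections
    """
    curve_of = {}
    neighbors = defaultdict(list)
    for u, v, cid in segments:
        # Store both half-edges; direction is ignored in this sketch.
        curve_of[(u, v)] = curve_of[(v, u)] = cid
        neighbors[u].append(v)
        neighbors[v].append(u)

    # Order the neighbours of every vertex counterclockwise by angle.
    for u, ring in neighbors.items():
        ux, uy = points[u]
        ring.sort(key=lambda w: math.atan2(points[w][1] - uy, points[w][0] - ux))

    visited = set()
    covers = []
    for start in list(curve_of):
        if start in visited:
            continue
        face = []
        u, v = start
        # Walk around the face lying to the left of the half-edge (u, v).
        while (u, v) not in visited:
            visited.add((u, v))
            face.append((u, v))
            ring = neighbors[v]
            # Next vertex: the neighbour of v just clockwise from u.
            w = ring[(ring.index(u) - 1) % len(ring)]
            u, v = v, w
        # Shoelace formula: bounded faces are traced counterclockwise,
        # so their signed area is positive; the outer face is negative.
        area = 0.5 * sum(
            points[a][0] * points[b][1] - points[b][0] * points[a][1]
            for a, b in face
        )
        if area > 0:
            covers.append(frozenset(curve_of[e] for e in face))
    return covers


# Toy layout: a 2 x 1 grid of intersections (hypothetical example data).
pts = {"A": (0, 0), "B": (1, 0), "E": (2, 0),
       "D": (0, 1), "C": (1, 1), "F": (2, 1)}
segs = [("A", "B", "h1"), ("B", "E", "h1"),
        ("D", "C", "h2"), ("C", "F", "h2"),
        ("A", "D", "v1"), ("B", "C", "v2"), ("E", "F", "v3")]

covers = minimal_cycle_covers(pts, segs)
for cover in covers:
    print(sorted(cover))
```

In this toy layout the two bounded faces share the curve `v2` but differ in their covers, which illustrates how the set of covers can distinguish the arrangement of the curves without relying on exact coordinates.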
