The Right (Angled) Perspective: Improving the Understanding of Road Scenes Using Boosted Inverse Perspective Mapping

Many tasks performed by autonomous vehicles, such as road marking detection, object tracking, and path planning, are simpler in bird's-eye view. Hence, Inverse Perspective Mapping (IPM) is often applied to remove the perspective effect from a vehicle's front-facing camera and to remap its images into a 2D domain, resulting in a top-down view. However, this leads to unnatural blurring and stretching of objects at greater distances because of the limited resolution of the camera, which limits the applicability of IPM. In this paper, we present an adversarial learning approach for generating a significantly improved IPM from a single camera image in real time. The generated bird's-eye-view images contain sharper features (e.g., road markings) and more homogeneous illumination, while (dynamic) objects are automatically removed from the scene, thus revealing the underlying road layout more clearly. We demonstrate our framework using real-world data from the Oxford RobotCar Dataset and show that scene understanding tasks directly benefit from our boosted IPM approach.
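To illustrate why a purely geometric IPM stretches distant objects, the following is a minimal sketch of the classical flat-ground mapping for a pinhole camera. It is not the adversarial method presented in the paper. The function names and every camera parameter (focal length, principal point, mounting height, pitch) are hypothetical values chosen for illustration, and the road is assumed to be perfectly flat.

```python
import math

# Hypothetical camera parameters (assumptions, not taken from the paper).
F = 1000.0                 # focal length in pixels
CX, CY = 640.0, 480.0      # principal point in pixels
HEIGHT = 1.5               # camera height above the road in metres
PITCH = math.radians(5.0)  # downward tilt of the optical axis


def ground_to_pixel(x_fwd, y_left):
    """Project a point on the flat road (metres) into the image (pixels)."""
    s, c = math.sin(PITCH), math.cos(PITCH)
    # Camera frame: x right, y down, z along the optical axis.
    xc = -y_left
    yc = HEIGHT * c - x_fwd * s
    zc = HEIGHT * s + x_fwd * c
    return CX + F * xc / zc, CY + F * yc / zc


def pixel_to_ground(u, v):
    """Inverse perspective mapping: pixel -> road coordinates (metres)."""
    s, c = math.sin(PITCH), math.cos(PITCH)
    a = (u - CX) / F
    r = (v - CY) / F
    denom = r * c + s
    if denom <= 0.0:
        # Rays at or above the horizon never intersect the road plane.
        raise ValueError("pixel lies at or above the horizon")
    depth = HEIGHT / denom              # distance along the optical axis
    x_fwd = HEIGHT * (c - r * s) / denom
    y_left = -a * depth
    return x_fwd, y_left


def row_footprint(v):
    """Forward road distance (metres) covered by one image row at row v."""
    return pixel_to_ground(CX, v - 1.0)[0] - pixel_to_ground(CX, v)[0]


near_step = row_footprint(900.0)  # row close to the bottom of the image
far_step = row_footprint(500.0)   # row closer to the horizon
print(f"near row covers {near_step:.4f} m, far row covers {far_step:.4f} m")
```

Under these assumed parameters, a single image row near the horizon covers far more road than a row near the bottom of the image. A top-down view resampled on a uniform metric grid therefore has to interpolate heavily at range, which is the blurring and stretching described above.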
