Vision-based Uneven BEV Representation Learning with Polar Rasterization and Surface Estimation

In this work, we propose PolarBEV for vision-based uneven BEV representation learning. To adapt to the foreshortening effect of camera imaging, we rasterize the BEV space both angularly and radially, and introduce polar embedding decomposition to model the associations among polar grids. The polar grids are rearranged into an array-like regular representation for efficient processing. In addition, to determine the 2D-to-3D correspondence, we iteratively update the BEV surface starting from a hypothetical plane, and adopt height-based feature transformation. PolarBEV maintains real-time inference speed on a single 2080Ti GPU, and outperforms other methods on both BEV semantic segmentation and BEV instance segmentation. Thorough ablations are presented to validate the design. The code will be released at https://github.com/SuperZ-Liu/PolarBEV.
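To make the two geometric ideas above concrete, the following is a minimal NumPy sketch, not the released PolarBEV implementation. It rasterizes the BEV plane into radial and angular bins stored as a regular array, then projects each cell at a hypothesized surface height into an image with a pinhole model. The function names, the uniform bin spacing, the intrinsics `K`, and the extrinsics `T_cam_from_ego` are all illustrative assumptions; the abstract does not specify them.

```python
import numpy as np


def polar_grid_centers(num_radial, num_angular, max_range):
    """Cartesian (x, y) centers of polar BEV cells.

    Returns an array of shape (num_radial, num_angular, 2), i.e. the polar
    grid laid out as a regular 2D array (rows: radius, columns: angle).
    Uniform spacing in radius and angle is an assumption of this sketch.
    """
    r_edges = np.linspace(0.0, max_range, num_radial + 1)
    a_edges = np.linspace(-np.pi, np.pi, num_angular + 1)
    r = 0.5 * (r_edges[:-1] + r_edges[1:])  # radial bin centers
    a = 0.5 * (a_edges[:-1] + a_edges[1:])  # angular bin centers
    rr, aa = np.meshgrid(r, a, indexing="ij")
    return np.stack([rr * np.cos(aa), rr * np.sin(aa)], axis=-1)


def project_to_image(xy, height, K, T_cam_from_ego):
    """Project BEV points lifted to a given height into pixel coordinates.

    xy:             (..., 2) ego-frame ground coordinates
    height:         (...)    hypothesized surface height per point
    K:              (3, 3)   pinhole intrinsics (assumed)
    T_cam_from_ego: (4, 4)   rigid transform from ego to camera frame (assumed)
    Returns pixel coordinates (..., 2) and a mask of points in front of the camera.
    """
    ones = np.ones_like(height)
    pts = np.concatenate([xy, height[..., None], ones[..., None]], axis=-1)
    cam = pts @ T_cam_from_ego.T          # homogeneous camera-frame points
    depth = cam[..., 2]
    valid = depth > 1e-6                  # keep only points in front of the camera
    uvw = cam[..., :3] @ K.T
    uv = uvw[..., :2] / np.where(valid, depth, 1.0)[..., None]
    return uv, valid


# Start from a flat hypothetical plane (height 0 everywhere).
centers = polar_grid_centers(num_radial=4, num_angular=8, max_range=40.0)
plane_height = np.zeros(centers.shape[:-1])
```

In an iterative scheme like the one the abstract describes, `plane_height` would be replaced by a refined height estimate at each step and the projection recomputed; how that estimate is produced is outside the scope of this sketch.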
