Semantic Cameras for 360-Degree Environment Perception in Automated Urban Driving

The European UP-Drive project addresses transportation-related challenges by providing key contributions that enable fully automated vehicle navigation and parking in complex urban areas, with the goal of a safer, more inclusive, more affordable and more environmentally friendly transportation system. For this purpose, the project consortium developed a prototype electric vehicle equipped with cameras and LiDAR sensors that is capable of driving autonomously around the city and finding available parking spots. In UP-Drive, we created an accurate, robust and redundant multi-modal environment perception system that provides 360° coverage around the vehicle. This paper summarizes the project's work on surround-view semantic perception using fisheye and narrow field-of-view semantic virtual cameras. Deep learning-based semantic, instance and panoptic segmentation networks that satisfy the accuracy and efficiency requirements have been developed and integrated into the final prototype. The UP-Drive automated vehicle has been successfully demonstrated in urban areas after extensive experiments and numerous field tests.
