Paper Information - Real-Time Panoptic Segmentation with Prototype Masks for Automated Driving

In this paper, we propose a fast, fully convolutional neural network for panoptic segmentation that provides an accurate semantic and instance-level representation of the environment in 2D space. We treat panoptic segmentation as a dense classification problem and generate masks for stuff classes as well as for each instance of the thing classes. Our network employs a shared backbone and a Feature Pyramid Network for multi-scale feature extraction, which we extend with dual decoders that learn background-specific and foreground-specific masks. Guided by object proposals, the panoptic head assembles location-sensitive prototype masks using a learned weighting scheme. Our solution runs in real time, taking 82 ms on high-resolution images, which makes it suitable for robotic applications and automated driving. Extensive experiments on the Cityscapes dataset demonstrate that our panoptic segmentation network is robust and accurate, achieving 57.3% PQ and 76.9% mIoU.
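The abstract does not spell out how the panoptic head combines prototype masks, so the following is only a minimal sketch of one common formulation: a per-instance weighted sum of prototypes, passed through a sigmoid and cropped to the object proposal. The function name `assemble_instance_mask`, the list-of-lists data layout, the sigmoid, and the 0.5 threshold are all illustrative assumptions rather than the authors' implementation, and the location-sensitive aspect of the prototypes is not modeled here.

```python
import math


def assemble_instance_mask(prototypes, weights, box, threshold=0.5):
    """Combine K prototype masks into one binary instance mask (illustrative only).

    prototypes: list of K masks, each an H x W list of floats (assumed layout)
    weights:    list of K floats, the per-instance coefficients (assumed learned)
    box:        (x0, y0, x1, y1) proposal in pixels, with x1 and y1 exclusive
    """
    height, width = len(prototypes[0]), len(prototypes[0][0])
    x0, y0, x1, y1 = box
    mask = [[0] * width for _ in range(height)]
    # Only pixels inside the proposal box can belong to the instance.
    for y in range(max(y0, 0), min(y1, height)):
        for x in range(max(x0, 0), min(x1, width)):
            # Weighted sum of the prototype responses at this pixel.
            logit = sum(w * p[y][x] for w, p in zip(weights, prototypes))
            prob = 1.0 / (1.0 + math.exp(-logit))
            mask[y][x] = 1 if prob > threshold else 0
    return mask


# Two toy 2 x 4 prototypes: one responds on the left half, one on the right half.
left = [[4.0, 4.0, -4.0, -4.0], [4.0, 4.0, -4.0, -4.0]]
right = [[-4.0, -4.0, 4.0, 4.0], [-4.0, -4.0, 4.0, 4.0]]

# Weights that select the "left" prototype and suppress the "right" one.
mask = assemble_instance_mask([left, right], weights=[1.0, -1.0], box=(0, 0, 4, 2))
print(mask)
```

In this toy case the weights pick out the left half of the image; narrowing the box to `(1, 0, 4, 2)` would additionally zero out the first column, which is how a proposal can separate neighbouring instances that share the same prototypes.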

Sergiu Nedevschi | Andra Petrovai
