UMC: A Unified Bandwidth-efficient and Multi-resolution based Collaborative Perception Framework

Multi-agent collaborative perception (MCP) has recently attracted much attention. It comprises three key processes: communication for sharing, collaboration for integration, and reconstruction for different downstream tasks. Existing methods design the collaboration process in isolation, ignoring the intrinsic interactions among the three processes and therefore yielding suboptimal performance. In contrast, we propose a Unified Collaborative perception framework named UMC, which optimizes the communication, collaboration, and reconstruction processes jointly using a Multi-resolution technique. For communication, we introduce a novel trainable multi-resolution and selective-region (MRSR) mechanism that achieves higher quality at lower bandwidth. For collaboration, we propose a graph-based scheme that operates at each resolution so as to accommodate the MRSR. Finally, the reconstruction integrates the multi-resolution collaborative features for downstream tasks. Since general metrics cannot systematically reflect the performance gain brought by MCP, we introduce a new evaluation metric that assesses MCP from different perspectives. To verify our algorithm, we conduct experiments on the V2X-Sim and OPV2V datasets. Our quantitative and qualitative experiments show that the proposed UMC greatly outperforms state-of-the-art collaborative perception approaches.
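The idea of sending only selected regions at several resolutions can be illustrated with a small sketch. This is not the UMC implementation: the abstract describes the MRSR mechanism as trainable and does not state its selection rule, so the fixed threshold below is a stand-in assumption, and all names (`avg_pool2`, `build_pyramid`, `select_regions`, `transmitted_fraction`) are hypothetical. The sketch only shows how a multi-resolution score pyramid plus per-level region selection reduces the number of cells an agent would transmit.

```python
from typing import List, Tuple

Grid = List[List[float]]


def avg_pool2(grid: Grid) -> Grid:
    """Halve the resolution by averaging non-overlapping 2x2 blocks."""
    h, w = len(grid) // 2, len(grid[0]) // 2
    return [
        [
            (grid[2 * i][2 * j] + grid[2 * i][2 * j + 1]
             + grid[2 * i + 1][2 * j] + grid[2 * i + 1][2 * j + 1]) / 4.0
            for j in range(w)
        ]
        for i in range(h)
    ]


def build_pyramid(grid: Grid, levels: int) -> List[Grid]:
    """Return [full resolution, 1/2 resolution, ...] with `levels` entries."""
    pyramid = [grid]
    for _ in range(levels - 1):
        pyramid.append(avg_pool2(pyramid[-1]))
    return pyramid


def select_regions(score: Grid, threshold: float) -> List[Tuple[int, int]]:
    """Assumed selection rule: keep cells whose score exceeds a fixed threshold."""
    return [
        (i, j)
        for i, row in enumerate(score)
        for j, s in enumerate(row)
        if s > threshold
    ]


def transmitted_fraction(pyramid: List[Grid], threshold: float) -> float:
    """Fraction of all pyramid cells that would be sent after selection."""
    sent = sum(len(select_regions(level, threshold)) for level in pyramid)
    total = sum(len(level) * len(level[0]) for level in pyramid)
    return sent / total


# Toy 4x4 score map: only the top-left 2x2 block is informative.
scores: Grid = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
pyramid = build_pyramid(scores, levels=2)
fraction = transmitted_fraction(pyramid, threshold=0.5)
```

In this toy case, 4 cells are selected at full resolution and 1 at half resolution, so 5 of the 20 pyramid cells would be transmitted. A learned selector would replace the threshold test in `select_regions`.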
