Enhancing embedded AI-based object detection using multi-view approach

Object detection based on convolutional neural networks (CNNs) is widely used in a multitude of emerging applications. Yet deploying CNNs on embedded edge devices with limited resources and tight power budgets remains a real challenge. In this paper, we address this issue by enhancing detection performance without impacting inference speed. We investigate the use of multiple views of the same scene to achieve better detection performance. We propose a novel system of distributed smart cameras in which each camera integrates a CNN for detection. Implementation results show that running lightweight networks on the distributed cameras can lead to better detection performance and a reduction in overall power consumption.
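
The abstract does not state how the per-camera detections are combined, so the following is only a minimal sketch of one possible late-fusion scheme, not the authors' method. It assumes each camera reports `(label, confidence)` pairs for the shared scene and keeps, for each label, the highest confidence seen across views. The function name `fuse_views`, the `min_score` threshold, and the sample values are all hypothetical.

```python
from collections import defaultdict


def fuse_views(per_camera_detections, min_score=0.5):
    """Combine class-level detections from several cameras (hypothetical scheme).

    per_camera_detections: one list per camera, each holding
    (label, confidence) pairs produced by that camera's detector.
    Returns a dict mapping each label to its best confidence across views,
    keeping only labels whose best confidence reaches min_score.
    """
    best = defaultdict(float)
    for detections in per_camera_detections:
        for label, confidence in detections:
            # A label missed or weakly detected in one view can still be
            # recovered if another view sees it clearly.
            best[label] = max(best[label], confidence)
    return {label: score for label, score in best.items() if score >= min_score}


# Illustrative values only: three cameras observing the same scene.
views = [
    [("person", 0.42), ("bicycle", 0.81)],
    [("person", 0.77)],
    [("dog", 0.30)],
]
fused = fuse_views(views)
```

In this sketch the person is below the threshold in the first view but is retained because the second view detects it with higher confidence, which illustrates how an additional viewpoint could compensate for a lightweight network's misses. A real system would also need to associate detections of the same physical object across cameras, which is omitted here.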
