Semantic frustum-based sparsely embedded convolutional detection

Frustum-based 3D detection methods depend heavily on the accuracy of the 2D detector: an object that is missed by the 2D image proposal will never be detected in the point cloud. In this work, we propose a novel method for 3D object detection, named semantic frustum-based sparsely embedded convolutional detection (SFB-SECOND), which addresses this limitation. Specifically, for an image and a LIDAR point cloud describing the same scene, we first use existing semantic segmentation and object detection methods to generate an object mask that selects all potential targets within two confidence-related regions. Using this object mask, we quickly locate the objects of interest in the LIDAR point cloud and extract them as a semantic frustum. The selected frustum not only rules out more background and irrelevant objects in the point cloud but also makes fuller use of the rich 3D information. Then, to improve the accuracy of orientation estimation, we introduce a refined form of region-aware loss regression that works together with the region-aware frustum. We also propose a new data augmentation strategy that further accelerates convergence and improves detection performance. The proposed SFB-SECOND achieves state-of-the-art performance on the KITTI 3D object detection benchmark at real-time speed, showing superiority over previous methods.
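The frustum selection step described above can be illustrated with a small sketch. This is not the authors' implementation: it assumes the points are already expressed in the camera coordinate frame, that a 3x4 projection matrix `P` is available, and that the object mask is a plain 2D grid of booleans. The names `project_point` and `select_semantic_frustum` are hypothetical, and the two confidence-related regions are not modeled here.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]


def project_point(P: Sequence[Sequence[float]], point: Point) -> Optional[Tuple[float, float]]:
    """Project a 3D point (camera frame, assumed) to pixel coordinates (u, v).

    P is a 3x4 projection matrix. Returns None for points at or behind
    the image plane, which cannot appear in the image.
    """
    x, y, z = point
    h = [row[0] * x + row[1] * y + row[2] * z + row[3] for row in P]
    depth = h[2]
    if depth <= 0:
        return None
    return h[0] / depth, h[1] / depth


def select_semantic_frustum(
    points: Sequence[Point],
    P: Sequence[Sequence[float]],
    mask: Sequence[Sequence[bool]],
) -> List[Point]:
    """Keep only the points whose image projection lands on a masked pixel."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    kept: List[Point] = []
    for p in points:
        uv = project_point(P, p)
        if uv is None:
            continue
        # floor (not int) so that slightly negative coordinates stay out of bounds
        col, row = math.floor(uv[0]), math.floor(uv[1])
        if 0 <= col < width and 0 <= row < height and mask[row][col]:
            kept.append(p)
    return kept


# Toy example: a 4x4 image whose mask marks a single pixel (row 1, column 2).
P = [[2.0, 0.0, 2.0, 0.0],
     [0.0, 2.0, 2.0, 0.0],
     [0.0, 0.0, 1.0, 0.0]]
mask = [[False] * 4 for _ in range(4)]
mask[1][2] = True
points = [(0.0, 0.0, 1.0), (0.0, -0.5, 1.0), (0.0, 0.0, -1.0), (-5.0, 0.0, 1.0)]
frustum_points = select_semantic_frustum(points, P, mask)
```

In the toy example, only the second point projects onto the masked pixel; the others fall on an unmasked pixel, behind the camera, or outside the image. A practical version would vectorize the projection and first transform LIDAR points into the camera frame using the dataset's calibration.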
