A Multi-modality Sensor System for Unmanned Surface Vehicle

Onboard multi-modality sensors significantly expand the perception ability of an Unmanned Surface Vehicle (USV). This paper aims to make full use of the various onboard sensors and to improve the USV's object detection performance. We address several challenges unique to applying a USV multi-modality sensor system in the complex maritime environment, and we use deep learning networks to achieve accurate object detection on the water surface. We first propose a multi-modality sensor calibration method. The network fuses RGB images with multiple point clouds from various sensors. The well-calibrated image and point cloud are then input to our deep object detection network, which performs 3D detection through a proposal generation network and an object detection network. We also make a series of improvements to the system framework that accelerate the detection procedure. We collected two datasets, one from a real-world offshore field and one from simulation scenes. Experiments on both datasets show valid calibration results. On this basis, our object detection network achieves better accuracy than the other methods compared. The performance of the proposed multi-modality sensor system meets the application requirements of our prototype USV platform.
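
To illustrate why calibration precedes fusion, the following minimal sketch projects LiDAR points into the image plane with a standard pinhole camera model, which is the operation a calibrated camera-LiDAR pair makes possible. It is not the calibration network proposed in this paper. The function name `project_points`, the extrinsic rotation `R` and translation `t`, the intrinsic matrix `K`, and all numeric values are assumptions chosen for illustration only.

```python
import numpy as np


def project_points(points_lidar, R, t, K):
    """Project N x 3 LiDAR points to pixel coordinates (hypothetical helper).

    R (3x3) and t (3,) are assumed extrinsics mapping the LiDAR frame to the
    camera frame; K (3x3) is an assumed pinhole intrinsic matrix.
    """
    # Rigid transform into the camera frame: X_cam = R @ X_lidar + t
    points_cam = points_lidar @ R.T + t
    # Keep only points in front of the camera (positive depth)
    in_front = points_cam[:, 2] > 0
    points_cam = points_cam[in_front]
    # Apply the intrinsics, then divide by depth to obtain pixel coordinates
    uvw = points_cam @ K.T
    pixels = uvw[:, :2] / uvw[:, 2:3]
    return pixels, points_cam[:, 2], in_front


# Illustrative values only: identity extrinsics and a 640 x 480 style camera
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.zeros(3)
points = np.array([[0.0, 0.0, 10.0],   # on the optical axis
                   [1.0, 0.0, 10.0],   # 1 m to the right at 10 m depth
                   [0.0, 0.0, -5.0]])  # behind the camera, discarded

pixels, depths, mask = project_points(points, R, t, K)
```

With accurate extrinsics, each projected point lands on the image pixel that observes the same surface, so image features and point features can be associated. An error in `R` or `t` shifts the projections and degrades any downstream fusion.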
