Real-Time 3D Object Detection and Tracking in Monocular Images of Cluttered Environment

This paper presents a novel method for real-time 3D object detection and tracking in monocular images. The method builds a map of a user-specified object from a video sequence and stores the data for later 3D object detection and tracking. Its main advantage is that it does not require an existing 3D model of the object. Instead, it first detects the target object using a state-of-the-art deep learning-based object detector, and then constructs the object's map using visual Simultaneous Localization and Mapping (vSLAM). Each map needs to be built only once, and maps of multiple different objects can be stored. A fast method is proposed to recognize the object in the stored maps with the aid of deep learning-based detection. The method needs only one camera and is robust in cluttered environments. Storing multiple maps allows previously reconstructed maps to be reused. Experimental results show that accurate, fast and robust detection and tracking are achieved.
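
The recognition step summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: it only assumes that each stored map carries the class label produced by the detector together with binary descriptors of its map points, so that the detected label narrows the set of candidate maps before descriptor matching. The names `ObjectMap`, `recognize`, `max_dist`, and `min_matches`, as well as the 8-bit toy descriptors, are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ObjectMap:
    """One stored object map (hypothetical structure, not from the paper)."""
    name: str               # identifier of the stored map
    label: str              # class label assigned by the object detector
    descriptors: List[int]  # binary descriptors of map points, stored as ints


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def count_matches(query: List[int], stored: List[int], max_dist: int = 1) -> int:
    """Count query descriptors whose nearest stored descriptor is within max_dist bits."""
    if not stored:
        return 0
    return sum(1 for q in query if min(hamming(q, s) for s in stored) <= max_dist)


def recognize(label: str, query: List[int], maps: List[ObjectMap],
              min_matches: int = 2) -> Optional[str]:
    """Return the best-matching stored map for the detected label, or None."""
    # Assumption: the detector label prunes maps of other object classes first.
    candidates = [m for m in maps if m.label == label]
    best_name, best_score = None, 0
    for m in candidates:
        score = count_matches(query, m.descriptors)
        if score > best_score:
            best_name, best_score = m.name, score
    return best_name if best_score >= min_matches else None


# Toy data: three stored maps and descriptors extracted from a detected region.
maps = [
    ObjectMap("mug_a", "cup", [0b11110000, 0b10101010, 0b00001111]),
    ObjectMap("mug_b", "cup", [0b01010101, 0b11000011]),
    ObjectMap("kb", "keyboard", [0b11110000, 0b10101010]),
]
query = [0b11110001, 0b10101000, 0b00001111]
print(recognize("cup", query, maps))
```

In this sketch, restricting the search to maps whose label agrees with the detection keeps the matching cost proportional to the number of maps of that class rather than to the total number of stored maps. A real system would match actual feature descriptors (for example ORB) and then estimate the camera pose from the resulting 2D-3D correspondences.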
