A Robust Monocular 3D Object Tracking Method Combining Statistical and Photometric Constraints

Region-based methods and direct methods have both become popular in recent years for tracking the 6-DoF pose of an object in monocular video sequences. Region-based methods estimate the object pose by maximizing the discrimination between statistical foreground and background appearance models, while direct methods minimize the photometric error through direct image alignment. In practice, region-based methods evaluate only the pixels within a narrow band around the object contour, a consequence of their level-set-based probabilistic formulation, so foreground pixels outside this band go unused. Direct methods, in contrast, use only the raw pixel values of the object and ignore the statistical properties of the foreground and background regions. In this paper, we find it beneficial to combine the two. We construct a new probabilistic formulation for 3D object tracking that combines statistical constraints from region-based methods with photometric constraints from direct methods, so that the statistical properties and the raw pixel values of the image are exploited in a complementary manner. Moreover, to improve performance when tracking heterogeneous objects in complex scenes, we propose to increase the distinctiveness of the foreground and background statistical models by partitioning the global foreground and background regions into a small number of sub-regions around the object contour. We demonstrate the effectiveness of the proposed strategies on a newly constructed real-world dataset containing different types of objects with ground-truth poses. Further experiments on several challenging public datasets show that our method obtains competitive or even superior tracking results compared with previous work.
Compared with the recent state-of-the-art region-based method, the proposed hybrid method proves more stable under silhouette pose ambiguities, at the cost of slightly lower tracking accuracy.
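
As a rough illustration of how a statistical (region-based) term and a photometric (direct) term might be combined into one cost, the following Python sketch evaluates both on a toy image. It is not the formulation of the paper, which the abstract does not spell out. The arctan-smoothed Heaviside function, the band width, the `weight` parameter, the convention that the signed distance `phi` is positive inside the object, and the helper `toy_stripe` are all assumptions made for illustration. The sketch also assumes the rendered model intensities are already aligned to the image grid and performs no pose optimization.

```python
import numpy as np


def smoothed_heaviside(phi, s=1.2):
    """Smooth step in [0, 1]; s (an assumed value) controls the slope at the contour."""
    return np.arctan(s * phi) / np.pi + 0.5


def region_energy(phi, p_fg, p_bg, band=8.0, eps=1e-12):
    """Negative log of pixel-wise fg/bg posteriors, restricted to a narrow band.

    phi  : signed distance to the projected contour (assumed positive inside)
    p_fg : per-pixel foreground posterior from a statistical appearance model
    p_bg : per-pixel background posterior
    """
    he = smoothed_heaviside(phi)
    in_band = np.abs(phi) <= band  # only pixels near the contour contribute
    likelihood = he * p_fg + (1.0 - he) * p_bg
    return float(-np.log(likelihood[in_band] + eps).sum())


def photometric_energy(rendered, observed, fg_mask):
    """Half the sum of squared intensity residuals over the projected foreground."""
    residual = (rendered - observed)[fg_mask]
    return float(0.5 * np.sum(residual ** 2))


def combined_energy(phi, p_fg, p_bg, rendered, observed, weight=1.0, band=8.0):
    """Weighted sum of both terms; weight is a hypothetical balancing factor."""
    fg_mask = phi > 0  # all pixels covered by the projected model
    e_region = region_energy(phi, p_fg, p_bg, band=band)
    e_photo = photometric_energy(rendered, observed, fg_mask)
    return e_region + weight * e_photo


def toy_stripe(center, size=32, half_width=8.0):
    """Hypothetical helper: signed distance of a vertical stripe centred at `center`."""
    xs = np.arange(size, dtype=float)
    return np.tile(half_width - np.abs(xs - center), (size, 1))


if __name__ == "__main__":
    phi_true = toy_stripe(15.5)            # contour at the true pose
    phi_off = toy_stripe(19.5)             # contour at a shifted pose hypothesis
    truth = phi_true > 0
    p_fg = np.where(truth, 0.9, 0.1)       # toy posteriors, not learned histograms
    p_bg = 1.0 - p_fg
    observed = np.where(truth, 0.8, 0.2)   # toy grey-level image
    rendered = np.full_like(observed, 0.8) # model texture intensity
    print(combined_energy(phi_true, p_fg, p_bg, rendered, observed))
    print(combined_energy(phi_off, p_fg, p_bg, rendered, observed))
```

In this toy setting the shifted hypothesis receives a higher cost from both terms: the band pixels disagree with the appearance posteriors, and the interior pixels disagree with the model intensities. A tracker would minimize such a cost over the pose parameters, which this sketch leaves out.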
