Feature Trajectory Retrieval with Application to Accurate Structure and Motion Recovery

Common structure-from-motion techniques do not explicitly handle occlusions and disocclusions caused by foreground objects, so a single 3D point often gives rise to several separate trajectories. Each of these discontinued trajectories yields its own, less accurate 3D point instead of contributing to one point. It is therefore highly desirable to enforce long, continuous trajectories that bridge occlusions automatically after a re-identification step. The solution proposed in this paper connects features in the current image to trajectories that were discontinued earlier during tracking. This is done with a correspondence analysis designed for wide baselines and an outlier elimination strategy based on the epipolar geometry. The resulting reference to the 3D object points can be used as a new constraint in the bundle adjustment. Features are localized using the SIFT detector, extended by a Gaussian approximation of the gradient image signal. This technique provides the robustness of SIFT coupled with increased localization accuracy. Our results show that the reconstruction can be drastically improved and drift reduced, especially in sequences with occlusions resulting from foreground objects. In scenarios with large occlusions, the new approach leads to reliable and accurate results where a standard reference method fails.
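The re-identification step described above can be illustrated with a minimal sketch. It is not the implementation used in the paper: the nearest-neighbour ratio test, the pixel threshold `max_dist`, the convention `x_cur^T F x_lost = 0`, and all function names are assumptions chosen for illustration, and the fundamental matrix `F` is assumed to be known already (in practice it would be estimated robustly, for example with RANSAC).

```python
import numpy as np

def match_to_lost_trajectories(desc_cur, desc_lost, ratio=0.8):
    """Match current-frame descriptors to descriptors of discontinued
    trajectories using nearest-neighbour search with a ratio test.
    Returns a list of (index_in_current, index_in_lost) pairs.
    The ratio value is an assumption, not taken from the paper."""
    matches = []
    if len(desc_lost) < 2:
        return matches  # the ratio test needs at least two candidates
    for i, d in enumerate(desc_cur):
        dist = np.linalg.norm(desc_lost - d, axis=1)
        order = np.argsort(dist)
        best, second = order[0], order[1]
        if dist[best] < ratio * dist[second]:
            matches.append((i, int(best)))
    return matches

def epipolar_distance(F, x_lost, x_cur):
    """Pixel distance of x_cur from the epipolar line F @ x_lost,
    assuming the convention x_cur^T F x_lost = 0."""
    p = np.append(x_lost, 1.0)   # homogeneous point in the earlier image
    q = np.append(x_cur, 1.0)    # homogeneous point in the current image
    line = F @ p                 # epipolar line (a, b, c) in the current image
    return abs(q @ line) / np.hypot(line[0], line[1])

def filter_by_epipolar(F, pts_cur, pts_lost, matches, max_dist=2.0):
    """Keep only matches whose current point lies within max_dist pixels
    of the epipolar line induced by the last point of the lost trajectory."""
    return [(i, j) for i, j in matches
            if epipolar_distance(F, pts_lost[j], pts_cur[i]) <= max_dist]
```

A surviving pair `(i, j)` would mean that feature `i` in the current image continues discontinued trajectory `j`, so both can refer to the same 3D object point in the bundle adjustment.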
