Combining local and global motion models for feature point tracking

Accurate feature point tracks through long sequences are a valuable substrate for many computer vision applications, e.g. non-rigid body tracking, video segmentation, video matching, and even object recognition. Existing algorithms may be arranged along an axis indicating how global a motion model they use to constrain tracks. Local methods, such as the KLT tracker, depend on local models of feature appearance and are easily distracted by occlusions, repeated structure, and image noise. This leads to short tracks, many of which are incorrect, and such tracks require considerable postprocessing before they yield a useful result. In restricted scenes, for example a rigid scene through which a camera is moving, such postprocessing can make use of global motion models to allow "guided matching", which yields long, high-quality feature tracks. However, many scenes of interest contain multiple motions or significant non-rigid deformations, which means that guided matching cannot be applied. In this paper we propose a general amalgam of local and global models to improve tracking even in these difficult cases. By viewing rank-constrained tracking as a probabilistic model of 2D tracks rather than 3D motion, we obtain a strong, robust motion prior, derived from the global motion in the scene. The result is a simple and powerful prior whose strength is easily tuned, enabling its use in any existing tracking algorithm.
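
To make the idea of a rank constraint on 2D tracks concrete, the following is a minimal sketch and not the method of the paper, whose details the abstract does not give. It assumes the standard setting in which the (x, y) coordinates of each track are stacked into one column of a measurement matrix, so that under an affine camera viewing a rigid scene the centred columns lie in a low-dimensional subspace. The function names `fit_track_basis` and `predict_missing`, and the choice of rank, are hypothetical and chosen only for illustration. The predicted position could serve as the mean of a motion prior inside a local tracker.

```python
import numpy as np

def fit_track_basis(W, rank):
    """Fit a low-rank subspace to complete tracks.

    W has shape (2F, P): each column stacks the (x, y) positions of one
    feature point over F frames. Returns the mean track and an
    orthonormal basis of shape (2F, rank).
    """
    mean = W.mean(axis=1, keepdims=True)
    U, _, _ = np.linalg.svd(W - mean, full_matrices=False)
    return mean[:, 0], U[:, :rank]

def predict_missing(track, observed, mean, basis):
    """Complete a partially observed track using the subspace.

    track: (2F,) vector; entries where `observed` is False are ignored.
    observed: (2F,) boolean mask of the known coordinates.
    Returns the full (2F,) track predicted by the low-rank model.
    """
    # Least-squares subspace coefficients from the observed rows only.
    coeffs, *_ = np.linalg.lstsq(
        basis[observed], track[observed] - mean[observed], rcond=None
    )
    return mean + basis @ coeffs
```

In this sketch the strength of the constraint is not modelled at all; a probabilistic version would attach a variance to the prediction so that the prior can be weighted against local appearance evidence.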
