Non-rigid structure from motion using ranklet-based tracking and non-linear optimization

In this paper, we address the problem of estimating the 3D structure and motion of a deformable object from a set of image features tracked automatically throughout a video sequence. Our contributions are twofold: first, we propose a non-linear optimization scheme that improves the motion and structure estimates; second, we propose a tracking algorithm based on ranklets, a recently developed family of orientation-selective rank features. It has been shown that if the 3D deformations of an object can be modeled as a linear combination of shape bases, then both its motion and its shape may be recovered using an extension of Tomasi and Kanade's factorization algorithm for affine cameras. Crucially, these factorization methods are model-free and work purely from video in an unconstrained setting: a single uncalibrated camera viewing an arbitrary 3D surface that is moving and articulating. The main drawback of existing methods is that they do not provide correct structure and motion estimates, because the motion matrix has a repetitive structure that the factorization algorithm does not respect. We present a non-linear optimization method that refines the motion and shape estimates by minimizing the image reprojection error, and that imposes the correct structure on the motion matrix through an appropriate choice of parameterization.
Factorization algorithms require as input a set of feature tracks, or correspondences, found throughout the image sequence. The challenge is to track the features while the object deforms and the image appearance changes as a result. We propose a model-free tracking algorithm based on ranklets, a multi-scale family of rank features whose orientation selectivity pattern is similar to that of Haar wavelets. A vector of ranklets encodes an appearance-based description of the neighborhood of each tracked point. Robustness is enhanced by adapting, for each point, the shape of the filters to the structure of its particular neighborhood. A stack of models is maintained for each tracked point in order to handle large appearance variations with limited drift. Our experiments on sequences of a human subject performing different facial expressions show that this tracker provides a good set of feature correspondences for the non-rigid 3D reconstruction algorithm.
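
The low-rank property that the factorization step relies on can be illustrated with a minimal sketch. It is not the method of this paper: it only builds noise-free synthetic tracks from K shape bases under an orthographic camera and applies a truncated SVD. The function names, the synthetic data, and the sizes F, P and K are all assumptions made for illustration, and translation is assumed to have been removed already.

```python
import numpy as np


def synthetic_tracks(F=40, P=60, K=2, seed=0):
    """Build a 2F x P measurement matrix from K random shape bases.

    Illustrative data only: each frame applies the first two rows of a
    random orthonormal 3x3 matrix (an orthographic camera) to a random
    linear combination of the bases. No translation, no noise.
    """
    rng = np.random.default_rng(seed)
    bases = rng.standard_normal((K, 3, P))
    W = np.zeros((2 * F, P))
    for f in range(F):
        Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
        R = Q[:2, :]                                  # 2x3, orthonormal rows
        weights = rng.standard_normal(K)              # deformation weights
        shape = np.tensordot(weights, bases, axes=1)  # 3 x P shape in frame f
        W[2 * f:2 * f + 2] = R @ shape
    return W


def factorize(W, K):
    """Rank-3K truncated SVD: W is approximated by M @ S.

    M (2F x 3K) and S (3K x P) are only defined up to an invertible
    3K x 3K transformation, so M does not yet have the repeated-rotation
    structure of a true motion matrix.
    """
    r = 3 * K
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    root = np.sqrt(s[:r])
    M = U[:, :r] * root
    S = root[:, None] * Vt[:r]
    return M, S, s


W = synthetic_tracks()
M, S, s = factorize(W, K=2)
residual = np.linalg.norm(W - M @ S) / np.linalg.norm(W)
print("relative reprojection residual:", residual)
print("ratio of 7th to 1st singular value:", s[6] / s[0])
```

In this sketch the SVD reproduces the tracks, but each pair of rows of M is a generic 2 x 3K block rather than a weighted repetition of a single 2 x 3 rotation. That gap is the one the abstract attributes to plain factorization and proposes to close by non-linear refinement.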