An Empirical Study of Non-Rigid Surface Feature Matching of Human from 3D Video

JVRB, 7(2010), no. 3. - This paper presents an empirical study of affine invariant feature detectors for matching on video sequences of people with non-rigid surface deformation. Recent advances in feature detection and wide baseline matching have focused on static scenes. Video frames of human movement, by contrast, capture highly non-rigid deformation such as loose hair, cloth creases, skin stretching and free-flowing clothing. This study evaluates the performance of six widely used feature detectors for sparse temporal correspondence on single-view and multiple-view video sequences. Both the number of features detected and their temporal matching are evaluated quantitatively, with and without ground truth correspondence. Recall-accuracy analysis of feature matching is reported for temporal correspondence on single-view and multiple-view sequences of people with variation in clothing and movement. The analysis shows that existing feature detection and matching algorithms are unreliable for fast movement with common clothing.
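
To make the recall-accuracy analysis concrete, the following is a minimal sketch of how the two quantities could be computed for one pair of frames. The abstract does not state the definitions, so this assumes recall is the fraction of ground truth correspondences that were recovered and accuracy is the fraction of returned matches that are correct. The function name `recall_accuracy` and the dictionary representation of matches are hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple


def recall_accuracy(
    matches: Dict[int, int], ground_truth: Dict[int, int]
) -> Tuple[float, float]:
    """Return (recall, accuracy) for one pair of frames.

    Both arguments map a feature id in frame t to a feature id in frame t+1.
    Assumed definitions (not taken from the paper):
      recall   = correct matches / ground truth correspondences
      accuracy = correct matches / matches returned by the matcher
    """
    if not ground_truth:
        raise ValueError("ground_truth must contain at least one correspondence")

    # A match is correct when the ground truth pairs the same two features.
    correct = sum(
        1 for src, dst in matches.items() if ground_truth.get(src) == dst
    )

    recall = correct / len(ground_truth)
    accuracy = correct / len(matches) if matches else 0.0
    return recall, accuracy


# Illustrative data: four true correspondences, three returned matches,
# two of which agree with the ground truth.
ground_truth = {0: 10, 1: 11, 2: 12, 3: 13}
matches = {0: 10, 1: 11, 2: 99}
recall, accuracy = recall_accuracy(matches, ground_truth)
print(f"recall={recall:.2f} accuracy={accuracy:.2f}")
```

Sweeping a matching threshold and recording the pair of values at each setting would give one recall-accuracy curve per detector, which is the kind of comparison the study reports.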
