4D Match Trees for Non-rigid Surface Alignment

This paper presents a method for dense 4D temporal alignment of partial reconstructions of non-rigid surfaces in complex scenes observed from single or multiple moving cameras. 4D Match Trees are introduced for robust global alignment of non-rigid shape based on the similarity between images across sequences and views. Wide-timeframe sparse correspondence between arbitrary pairs of images is established using a segmentation-based feature detector (SFD), which is demonstrated to give improved matching of non-rigid shape. Sparse SFD correspondence allows the similarity between any pair of image frames to be estimated for moving cameras and multiple views. This enables construction of the 4D Match Tree, which minimises the observed change in non-rigid shape for global alignment across all images. Dense 4D temporal correspondence across all frames is then estimated by traversing the 4D Match Tree using optical flow initialised from the sparse feature matches. The approach is evaluated on single- and multiple-view image sequences for alignment of partial surface reconstructions of dynamic objects in complex indoor and outdoor scenes to obtain a temporally consistent 4D representation. Comparison with previous 2D and 3D scene flow methods demonstrates that 4D Match Trees achieve reduced errors due to drift and improved robustness to large non-rigid deformations.
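
The tree construction and traversal described above can be illustrated with a small sketch. It assumes, as one plausible reading of the abstract, that the tree is a minimum spanning tree over a pairwise frame dissimilarity matrix, built here with Prim's algorithm; the abstract itself does not state the exact construction. The function names, the choice of root frame, and the toy dissimilarity values are all hypothetical, and the dissimilarity would in practice be derived from the sparse SFD matches.

```python
import heapq
from collections import deque


def build_match_tree(dissimilarity, root=0):
    """Minimum spanning tree over a dense, symmetric frame-dissimilarity
    matrix using Prim's algorithm (an assumed construction, see lead-in).
    Returns a parent list where parent[root] is None."""
    n = len(dissimilarity)
    parent = [None] * n
    in_tree = [False] * n
    heap = [(0.0, root, None)]  # (edge cost, frame, frame it attaches to)
    while heap:
        _, node, par = heapq.heappop(heap)
        if in_tree[node]:
            continue
        in_tree[node] = True
        parent[node] = par
        for other in range(n):
            if not in_tree[other]:
                heapq.heappush(heap, (dissimilarity[node][other], other, node))
    return parent


def traversal_order(parent, root=0):
    """Breadth-first list of (source_frame, target_frame) edges, i.e. the
    order in which dense correspondence would be propagated from the root."""
    children = {i: [] for i in range(len(parent))}
    for node, par in enumerate(parent):
        if par is not None:
            children[par].append(node)
    order = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children[node]:
            order.append((node, child))
            queue.append(child)
    return order


# Toy dissimilarity between four frames (hypothetical values).
D = [
    [0, 1, 4, 2],
    [1, 0, 5, 3],
    [4, 5, 0, 1],
    [2, 3, 1, 0],
]
parent = build_match_tree(D, root=0)
edges = traversal_order(parent, root=0)
print(parent)  # [None, 0, 3, 0]
print(edges)   # [(0, 1), (0, 3), (3, 2)]
```

In this toy example frame 2 is reached through frame 3 (cost 1) rather than through its sequential neighbour frame 1 (cost 5), which is the kind of non-sequential path that would limit accumulated drift. In a full pipeline each edge would be handed to an optical flow step initialised from the sparse matches for that frame pair.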
