Spatio-temporal fusion of multiple view video rate 3D surfaces

We consider the problem of geometric integration and representation of multiple views of non-rigidly deforming 3D surface geometry captured at video rate. Instead of treating each frame as a separate mesh, we present a representation that exploits temporal and spatial coherence in the data where possible. We first segment gross base transformations using correspondence based on a closest-point metric and represent these motions as piecewise rigid transformations. The remaining residual is encoded as a displacement map at each frame, giving a displacement video. At both stages, occlusions and missing data are interpolated to give a representation that is continuous in space and time. We demonstrate the integration of multiple views for four different non-rigidly deforming scenes: hand, face, cloth and a composite scene. The approach integrates multiple-view data captured at different times into one representation which can be processed and edited.
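As a rough illustration of the gap-filling idea, the sketch below fills missing samples in a displacement video along the time axis. The abstract does not state which interpolation scheme is used, so linear interpolation between the nearest valid frames, the hold-at-nearest-value behaviour at the ends of a sequence, and the names `fill_temporal_gaps` and `fill_displacement_video` are all assumptions made for this example, not the method of the paper.

```python
from typing import List, Optional


def fill_temporal_gaps(samples: List[Optional[float]]) -> List[float]:
    """Fill None entries in one texel's displacement sequence over time.

    Assumption: gaps are filled by linear interpolation between the nearest
    valid frames; leading and trailing gaps hold the nearest valid value.
    """
    valid = [i for i, v in enumerate(samples) if v is not None]
    if not valid:
        raise ValueError("no valid samples to interpolate from")

    filled = list(samples)

    # Gaps before the first and after the last valid frame: hold the value.
    for i in range(valid[0]):
        filled[i] = samples[valid[0]]
    for i in range(valid[-1] + 1, len(samples)):
        filled[i] = samples[valid[-1]]

    # Interior gaps: blend linearly between the two bracketing valid frames.
    for a, b in zip(valid, valid[1:]):
        for i in range(a + 1, b):
            t = (i - a) / (b - a)
            filled[i] = (1 - t) * samples[a] + t * samples[b]
    return filled


def fill_displacement_video(
    video: List[List[Optional[float]]],
) -> List[List[float]]:
    """Fill a displacement video indexed as video[frame][texel].

    Each texel is treated as an independent time series (an assumption;
    spatial interpolation within a frame is not modelled here).
    """
    n_frames = len(video)
    n_texels = len(video[0])
    columns = [
        fill_temporal_gaps([video[f][k] for f in range(n_frames)])
        for k in range(n_texels)
    ]
    # Transpose back to frame-major order.
    return [[columns[k][f] for k in range(n_texels)] for f in range(n_frames)]
```

Calling `fill_displacement_video` on a list of frames in which occluded texels are marked `None` returns a video of the same shape with every entry defined, which is one simple way to obtain a residual that is continuous in time.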
