Video-based characters: creating new human performances from a multi-view video database

We present a method to synthesize plausible video sequences of humans according to user-defined body motions and viewpoints. We first capture a small database of multi-view video sequences of an actor performing various basic motions. This database needs to be captured only once and serves as the input to our synthesis algorithm. We then apply a marker-less model-based performance capture approach to the entire database to obtain the pose and geometry of the actor in each database frame. To create novel video sequences of the actor from the database, a user animates a 3D human skeleton with novel motion and viewpoints. Our technique then synthesizes a realistic video sequence of the actor performing the specified motion based only on the initial database. The first key component of our approach is a new, efficient retrieval strategy that finds appropriate spatio-temporally coherent database frames from which to synthesize target video frames. The second key component is a warping-based texture synthesis approach that uses the most similar retrieved database frames to synthesize spatio-temporally coherent target video frames. This makes it possible, for instance, to create video sequences of actors performing dangerous stunts without placing them in harm's way. We show through a variety of result videos and a user study that we can synthesize realistic videos of people, even if the target motions and camera views differ from the database content.
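
The retrieval step described above can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give their cost function, so the code assumes a plain Euclidean distance between pose vectors and a fixed penalty whenever two consecutive picks are not consecutive database frames, and it solves for the cheapest sequence by dynamic programming over the k nearest candidates per target frame. The function name `retrieve_coherent_frames` and the parameters `k` and `jump_penalty` are hypothetical and chosen for illustration only.

```python
import numpy as np


def retrieve_coherent_frames(db_poses, target_poses, k=5, jump_penalty=1.0):
    """Pick one database frame index per target frame.

    Assumed cost (illustrative, not from the paper):
      pose distance + jump_penalty whenever consecutive picks are not
      consecutive frames in the database.
    """
    db = np.asarray(db_poses, dtype=float)       # shape (N, D)
    tg = np.asarray(target_poses, dtype=float)   # shape (T, D)
    k = min(k, len(db))

    # Pairwise Euclidean pose distances, shape (T, N).
    dist = np.linalg.norm(tg[:, None, :] - db[None, :, :], axis=2)
    # Keep the k most similar database frames per target frame.
    cand = np.argsort(dist, axis=1)[:, :k]              # (T, k) frame indices
    unary = np.take_along_axis(dist, cand, axis=1)      # (T, k) pose costs

    num_frames = len(tg)
    cost = unary[0].copy()
    back = np.zeros((num_frames, k), dtype=int)
    for t in range(1, num_frames):
        # A transition is free only if it continues the database sequence.
        jumps = cand[t - 1][:, None] + 1 != cand[t][None, :]
        total = cost[:, None] + jump_penalty * jumps    # (k_prev, k_cur)
        back[t] = np.argmin(total, axis=0)
        cost = total[back[t], np.arange(k)] + unary[t]

    # Backtrack the cheapest path through the candidate lattice.
    path = np.empty(num_frames, dtype=int)
    j = int(np.argmin(cost))
    for t in range(num_frames - 1, -1, -1):
        path[t] = cand[t, j]
        j = back[t, j]
    return path.tolist()


if __name__ == "__main__":
    # Toy 1D "poses": frame 2 matches the second target pose exactly,
    # but frame 1 directly follows frame 0 in the database.
    db = [[0.0], [1.2], [1.0]]
    target = [[0.0], [1.0]]
    print(retrieve_coherent_frames(db, target, k=2, jump_penalty=1.0))
    print(retrieve_coherent_frames(db, target, k=2, jump_penalty=0.0))
```

With a nonzero `jump_penalty` the sketch prefers a slightly worse pose match that continues the database sequence; with the penalty set to zero it falls back to independent nearest-neighbour lookup per frame. A real system would also need to account for the camera viewpoint, which this sketch leaves out.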

[16]  Jan Kautz,et al.  Video-based characters: creating new human performances from a multi-view video database , 2011, SIGGRAPH 2011.

[17]  Marcus A. Magnor,et al.  View and Time Interpolation in Image Space , 2008, Comput. Graph. Forum.

[18]  Wojciech Matusik,et al.  Articulated mesh animation from multi-view silhouettes , 2008, ACM Trans. Graph..

[19]  Markus H. Gross,et al.  Interactive 3D video editing , 2006, The Visual Computer.

[20]  Richard Szeliski,et al.  Video textures , 2000, SIGGRAPH.

[21]  Tim Weyrich,et al.  Rendering deformable surface reflectance fields , 2005, IEEE Transactions on Visualization and Computer Graphics.

[22]  Simon Baker,et al.  Visual hull construction, alignment and refinement for human kinematic modeling, motion tracking and rendering , 2003 .

[23]  Victor B. Zordan,et al.  Animated People Textures , 2004 .

[24]  Jitendra Malik,et al.  Modeling and Rendering Architecture from Photographs: A hybrid geometry- and image-based approach , 1996, SIGGRAPH.

[25]  Scott Schaefer,et al.  Image deformation using moving least squares , 2006, ACM Trans. Graph..

[26]  Richard Szeliski,et al.  High-quality video view interpolation using a layered representation , 2004, SIGGRAPH 2004.

[27]  Atsushi Nakazawa,et al.  Human video textures , 2009, I3D '09.

[28]  Adrian Hilton,et al.  Video-based character animation , 2005, SCA '05.

[29]  Derek Bradley,et al.  Markerless garment capture , 2008, ACM Trans. Graph..

[30]  Takashi Matsuyama,et al.  Complete multi-view reconstruction of dynamic scenes from probabilistic fusion of narrow and wide baseline stereo , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[31]  Martin Jägersand,et al.  Dynamic Textures for Image‐based Rendering of Fine‐Scale 3D Structure and Animation of Non‐rigid Motion , 2002, Comput. Graph. Forum.

[32]  John Hart,et al.  ACM Transactions on Graphics , 2004, SIGGRAPH 2004.

[33]  Jovan Popovic,et al.  Automatic rigging and animation of 3D characters , 2007, ACM Trans. Graph..

[34]  Leif Kobbelt,et al.  Interactive Pixel‐Accurate Free Viewpoint Rendering from Images with Silhouette Aware Sampling , 2009, Comput. Graph. Forum.

[35]  Hans-Peter Seidel,et al.  Performance capture from sparse multi-view video , 2008, ACM Trans. Graph..

[36]  Andrew Jones,et al.  Relighting human locomotion with flowed reflectance fields , 2006, EGSR '06.

[37]  S. Gortler,et al.  3D Deformation Using Moving Least Squares , 2007 .

[38]  Michael Bosse,et al.  Unstructured lumigraph rendering , 2001, SIGGRAPH.

[39]  Hans-Peter Seidel,et al.  MovieReshape: tracking and reshaping of humans in videos , 2010, ACM Trans. Graph..

[40]  Sebastian Thrun,et al.  Video-based reconstruction of animatable human characters , 2010, ACM Trans. Graph..

[41]  Marc Levoy,et al.  High performance imaging using large camera arrays , 2005, ACM Trans. Graph..

[42]  M. Pollefeys,et al.  Unstructured video-based rendering: interactive exploration of casually captured videos , 2010, ACM Trans. Graph..

[43]  Takeo Kanade,et al.  Constructing virtual worlds using dense stereo , 1998, Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271).

[44]  Luca Ballan,et al.  Marker-less motion capture of skinned models in a four camera set-up using optical flow and silhouettes , 2008 .