Image-based spatio-temporal modeling and view interpolation of dynamic events

We present an approach for modeling and rendering a dynamic, real-world event from an arbitrary viewpoint and at any time, using images captured by multiple video cameras. The event is modeled as a nonrigidly varying dynamic scene, captured in many images from different viewpoints at discrete times. First, the spatio-temporal geometric properties (shape and instantaneous motion) are computed. The view synthesis problem is then solved using a reverse mapping algorithm, ray-casting across space and time, to compute a novel image from any viewpoint in the 4D space of position and time. Results are shown on real-world events captured in the CMU 3D Room, by creating synthetic renderings of the event from novel, arbitrary positions in space and time. Multiple such renderings can be combined to create retimed fly-by movies of the event, giving a visual experience richer than that of a regular video clip or of switching between images from multiple cameras.
