Interactive Pixel‐Accurate Free Viewpoint Rendering from Images with Silhouette Aware Sampling

We present an integrated, fully GPU‐based processing pipeline to interactively render new views of arbitrary scenes from calibrated but otherwise unstructured input views. In a two‐step procedure, our method first generates a dense proxy of the scene for each input view using a new multi‐view stereo formulation. Each scene proxy consists of a structured cloud of feature‐aware particles whose image‐space footprints are automatically aligned to depth discontinuities of the scene geometry, so that sharp object boundaries and occlusions are handled effectively. We propose a particle optimization routine combined with a special parameterization of the view space that enables efficient proxy generation as well as robust and intuitive filter operators for noise and outlier removal. Moreover, our generic proxy generation allows us to flexibly handle scene complexities ranging from small objects up to complete outdoor scenes. The second phase of the algorithm combines these particle clouds in real time into a view‐dependent proxy for the desired output view and performs a pixel‐accurate accumulation of the colour contributions from each available input view. This makes it possible to reconstruct even fine‐scale view‐dependent illumination effects. We demonstrate how all processing stages of the pipeline can be implemented entirely on the GPU with memory‐efficient, scalable data structures for maximum performance. This allows us to generate new output renderings of high visual quality from input images in real time.
