Temporally-coherent Novel Video Synthesis Using Texture-based Priors

In this paper we propose a method to construct a virtual sequence for a camera moving through a static environment, given an input sequence captured along a different camera trajectory. Existing image-based rendering techniques can generate photorealistic images given a set of input views, though the output images almost unavoidably contain small regions where the colour has been incorrectly chosen. In a single image these artifacts are often hard to spot, but they become more obvious when a real image is viewed with its virtual stereo pair, and even more so when a sequence of novel views is generated, since the artifacts are rarely temporally consistent. To address this problem of consistency, we propose a new spatio-temporal approach to novel video synthesis. Our method exploits epipolar geometry to impose constraints on the temporal coherence of the rendered views. The pixels in the output video sequence are modelled as nodes of a 3-D graph. We define a Markov random field (MRF) on the graph which encodes photoconsistency of pixels as well as texture priors in both space and time. Unlike methods based on scene geometry, which yield highly connected graphs, our approach results in a graph whose degree is independent of scene structure. The MRF energy is therefore tractable, and we minimise it for the whole sequence using a state-of-the-art message passing optimisation algorithm. We demonstrate the effectiveness of our approach in reducing temporal artifacts.
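
To make the graph structure concrete, the following is a minimal sketch of an energy defined over a 3-D grid of output pixels (time, row, column). It is an illustration under stated assumptions, not the paper's formulation: the plain 6-connected neighbourhood, the `unary` table standing in for photoconsistency costs, and the truncated absolute difference standing in for the texture prior are all hypothetical choices, and the function names are invented for this example. The abstract states that temporal constraints come from epipolar geometry, which this sketch does not model. What it does show is that every node has at most six neighbours regardless of what the scene contains.

```python
from itertools import product

# Offsets to the six grid neighbours of a node (assumed 6-connectivity).
OFFSETS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))
# Forward offsets only, so that each undirected edge is visited exactly once.
FORWARD = ((1, 0, 0), (0, 1, 0), (0, 0, 1))


def neighbours(t, y, x, T, H, W):
    """Return the in-bounds grid neighbours of node (t, y, x)."""
    result = []
    for dt, dy, dx in OFFSETS:
        t2, y2, x2 = t + dt, y + dy, x + dx
        if 0 <= t2 < T and 0 <= y2 < H and 0 <= x2 < W:
            result.append((t2, y2, x2))
    return result


def mrf_energy(labels, unary, pairwise_weight=1.0, truncation=2):
    """Sum unary and pairwise costs of a labelling on a T x H x W grid.

    labels[t][y][x]   : integer label chosen for an output pixel
    unary[t][y][x][l] : hypothetical photoconsistency cost of label l there
    The pairwise term is a truncated absolute label difference, used here
    only as a stand-in for a texture-based prior.
    """
    T, H, W = len(labels), len(labels[0]), len(labels[0][0])
    energy = 0.0
    for t, y, x in product(range(T), range(H), range(W)):
        label = labels[t][y][x]
        energy += unary[t][y][x][label]
        for dt, dy, dx in FORWARD:
            t2, y2, x2 = t + dt, y + dy, x + dx
            if t2 < T and y2 < H and x2 < W:
                diff = abs(label - labels[t2][y2][x2])
                energy += pairwise_weight * min(diff, truncation)
    return energy
```

A message passing optimiser would search for the labelling that minimises such an energy. Because the node degree is bounded by a constant, the number of pairwise terms grows linearly with the number of output pixels.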
