Natural video matting using camera arrays

We present an algorithm and a system for high-quality natural video matting using a camera array. The system exploits the high frequencies present in natural scenes to compute mattes: it creates a synthetic aperture image focused on the foreground object, which reduces the variance of pixels reprojected from the foreground while increasing the variance of pixels reprojected from the background. We modify the standard matting equation to work directly with variance measurements and show how these statistics can be used to construct a trimap that is later upgraded to an alpha matte. The entire process is automatic, including the focusing of the synthetic aperture image on the foreground object and the computation of the trimap and the alpha matte. The algorithm is efficient, with a per-pixel running time that is linear in the number of cameras. Our current system runs at several frames per second, and we believe it is the first system capable of computing high-quality alpha mattes at near real-time rates without active illumination or special backgrounds.
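
The variance idea can be illustrated with a minimal sketch. It is not the system's actual implementation. It assumes the camera images have already been reprojected onto the foreground focal plane and stacked into one array, and it builds a rough trimap by thresholding the per-pixel variance across cameras. The function name `variance_trimap` and the thresholds `low` and `high` are hypothetical, chosen only for illustration.

```python
import numpy as np

def variance_trimap(aligned, low=0.002, high=0.02):
    """Rough trimap from per-pixel variance across cameras.

    aligned: array of shape (n_cameras, H, W), grayscale images in [0, 1],
             assumed to be already reprojected onto the foreground plane.
    low, high: hypothetical variance thresholds (not taken from the paper).
    Returns (mean, var, trimap); trimap is 1.0 for likely foreground,
    0.0 for likely background, and 0.5 for unknown.
    """
    stack = np.asarray(aligned, dtype=np.float64)
    # The mean over cameras is the synthetic aperture image.
    mean = stack.mean(axis=0)
    # Per-pixel variance over cameras; cost is linear in n_cameras.
    var = stack.var(axis=0)
    trimap = np.full(mean.shape, 0.5)
    trimap[var <= low] = 1.0   # cameras agree: consistent with in-focus foreground
    trimap[var >= high] = 0.0  # cameras disagree: consistent with textured background
    return mean, var, trimap
```

Low variance alone does not prove a pixel is foreground, since a textureless background also yields low variance. That is one reason this thresholding is only a starting point, and the unknown region would still need to be resolved into an alpha matte.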
