Defocus video matting

Video matting is the process of pulling a high-quality alpha matte and foreground from a video sequence. Current techniques require either a known background (e.g., a blue screen) or extensive user interaction (e.g., to specify known foreground and background elements). The matting problem is generally under-constrained, since not enough information has been collected at capture time. We propose a novel, fully autonomous method for pulling a matte using multiple synchronized video streams that share a point of view but differ in their plane of focus. The solution is obtained by directly minimizing the error in filter-based image formation equations, which are over-constrained by our rich data stream. Our system solves the fully dynamic video matting problem without user assistance: both the foreground and background may be high frequency and have dynamic content, the foreground may resemble the background, and the scene is lit by natural (as opposed to polarized or collimated) illumination.
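The idea of minimizing the error in filter-based image formation equations can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the abstract: the use of three streams (pinhole, foreground-focused, background-focused), the specific form of the three equations, the box filter standing in for a real defocus point spread function, the grayscale images, and the names `box_blur`, `render`, and `matting_error`. Under these assumptions, each candidate (alpha, foreground, background) predicts three images, and the sum of squared differences from the observed images is the quantity an optimizer would drive toward zero.

```python
import numpy as np


def box_blur(img, radius):
    """Blur a 2-D array with a (2r+1)x(2r+1) box filter (edge-padded).

    A box filter is a crude stand-in for a lens defocus kernel.
    """
    img = np.asarray(img, dtype=float)
    if radius == 0:
        return img.copy()
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w), dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)


def render(alpha, fg, bg, r_fg, r_bg):
    """Predict the three streams from a candidate matte (assumed model)."""
    # Pinhole stream: everything sharp, ordinary compositing.
    pinhole = alpha * fg + (1.0 - alpha) * bg
    # Foreground-focused stream: background seen through a blur.
    fg_focus = alpha * fg + (1.0 - alpha) * box_blur(bg, r_bg)
    # Background-focused stream: foreground and its matte are blurred.
    bg_focus = box_blur(alpha * fg, r_fg) + (1.0 - box_blur(alpha, r_fg)) * bg
    return pinhole, fg_focus, bg_focus


def matting_error(alpha, fg, bg, observed, r_fg=2, r_bg=2):
    """Sum of squared residuals between predicted and observed streams."""
    predicted = render(alpha, fg, bg, r_fg, r_bg)
    return sum(float(np.sum((p - o) ** 2)) for p, o in zip(predicted, observed))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    alpha_true = rng.random((16, 16))
    fg_true = rng.random((16, 16))
    bg_true = rng.random((16, 16))
    observed = render(alpha_true, fg_true, bg_true, 2, 2)

    # The true scene reproduces the observations; a wrong matte does not.
    print(matting_error(alpha_true, fg_true, bg_true, observed))
    print(matting_error(0.5 * alpha_true, fg_true, bg_true, observed))
```

In this grayscale toy each pixel has three unknowns and three observations. With color images the same three-stream assumption would give nine observations per pixel against seven unknowns (alpha plus RGB foreground and background), which is one way to read the claim that the equations are over-constrained.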
