Depth from Combining Defocus and Correspondence Using Light-Field Cameras

Light-field cameras have recently become available to the consumer market. An array of micro-lenses captures enough information that one can refocus images after acquisition, as well as shift one's viewpoint within the sub-apertures of the main lens, effectively obtaining multiple views. Thus, depth cues from both defocus and correspondence are available simultaneously in a single capture. Previously, defocus cues could be obtained only through multiple exposures focused at different depths, while correspondence cues required multiple exposures at different viewpoints or multiple cameras; moreover, the two cues could not easily be obtained together. In this paper, we present a novel, simple, and principled algorithm that computes dense depth estimates by combining both defocus and correspondence depth cues. We analyze the 2D x-u epipolar image (EPI), where by convention we assume the spatial x coordinate is horizontal and the angular u coordinate is vertical (our final algorithm uses the full 4D EPI). We show that defocus depth cues are obtained by computing the horizontal (spatial) variance after vertical (angular) integration, and correspondence depth cues by computing the vertical (angular) variance. We then show how to combine the two cues into a high-quality depth map, suitable for computer vision applications such as matting, full control of depth-of-field, and surface reconstruction.
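To make the two cues concrete, the following is a minimal NumPy sketch on a single 2D (u, x) EPI. The function names (shear_epi, depth_cues, estimate_depth), the candidate-slope sweep, the window size, and the final confidence-weighted average are all illustrative assumptions, not the paper's method: the actual algorithm operates on the full 4D EPI and fuses the cues with per-cue confidence measures and MRF-based global optimization, which the simple weighted average below only approximates.

```python
import numpy as np

def shear_epi(epi, slope):
    """Shear a 2D (u, x) EPI so lines of the given slope become vertical.

    epi: (U, X) array; row u holds the 1D image seen from angular position u.
    slope: candidate disparity in pixels of horizontal shift per angular step.
    """
    U, X = epi.shape
    x = np.arange(X)
    u = np.arange(U) - (U - 1) / 2.0              # center the angular axis
    sheared = np.empty((U, X))
    for i, du in enumerate(u):
        # Resample row i at x + slope*du (linear interpolation, edges clamped).
        xs = np.clip(x + slope * du, 0, X - 1)
        sheared[i] = np.interp(xs, x, epi[i])
    return sheared

def depth_cues(epi, slopes, win=5):
    """Per-pixel defocus and correspondence responses for each candidate slope."""
    kernel = np.ones(win) / win                    # small spatial averaging window
    defocus = np.empty((len(slopes), epi.shape[1]))
    corresp = np.empty_like(defocus)
    for k, s in enumerate(slopes):
        sh = shear_epi(epi, s)
        refocused = sh.mean(axis=0)                # vertical (angular) integration
        contrast = np.abs(np.gradient(refocused))  # horizontal (spatial) contrast
        defocus[k] = np.convolve(contrast, kernel, mode='same')
        corresp[k] = np.convolve(sh.var(axis=0), kernel, mode='same')  # angular variance
    return defocus, corresp

def estimate_depth(epi, slopes):
    slopes = np.asarray(slopes, dtype=float)
    d, c = depth_cues(epi, slopes)
    best_d = np.argmax(d, axis=0)   # defocus cue: sharpest refocused image
    best_c = np.argmin(c, axis=0)   # correspondence cue: most consistent views
    # Peak-to-mean ratios as crude per-cue confidences (a stand-in for the
    # paper's confidence measures and MRF-based global fusion).
    conf_d = d.max(axis=0) / (d.mean(axis=0) + 1e-8)
    conf_c = (c.mean(axis=0) + 1e-8) / (c.min(axis=0) + 1e-8)
    w = conf_d / (conf_d + conf_c)
    return w * slopes[best_d] + (1 - w) * slopes[best_c]
```

For example, estimate_depth(epi, np.linspace(-2, 2, 33)) sweeps 33 disparity hypotheses. The key intuition the sketch captures: for a Lambertian point sheared at the correct slope, all angular samples agree, so the angular variance dips exactly where the refocused image is sharpest, and the two extrema reinforce each other.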
