Fast and Accurate Depth Estimation From Sparse Light Fields

We present a fast and accurate method for dense depth reconstruction that is specifically tailored to sparse, wide-baseline light field data captured with camera arrays. In our method, the source images are over-segmented into non-overlapping, compact superpixels. We model the superpixels as planar patches in image space and use them as the basic primitives for depth estimation. This superpixel-based representation reduces both memory and computation requirements while preserving image geometry along object contours. The initial depth maps, obtained by plane sweeping independently for each view, are jointly refined by an iterative, belief-propagation-like optimization in the superpixel domain. During the optimization, smoothness between neighboring superpixels and geometric consistency between views are enforced. To ensure rapid propagation of information into textureless and occluded regions, candidates are sampled from larger neighborhoods in addition to the immediate superpixel neighbors. Furthermore, to make full use of parallel graphics hardware, a synchronous message update schedule is employed, which allows all superpixels of all images to be processed at once. As a result, the distribution of the scene geometry becomes distinctive after the first few iterations, which facilitates stability and fast convergence of the refinement procedure. We demonstrate that a few refinement iterations yield globally consistent dense depth maps even in the presence of wide textureless regions and occlusions. The experiments show that, while depth reconstruction takes about one second per full high-definition view, the accuracy of the obtained depth maps is comparable to that of state-of-the-art methods, which require much longer processing times.
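
To make the idea of a synchronous update schedule concrete, the following is a minimal sketch and not the method of the paper. It assumes a precomputed cost table with one row per superpixel and one column per depth label, and it swaps in a much simpler technique: a parallel, iterated-conditional-modes-style label update with a truncated linear smoothness term, rather than full belief propagation with cross-view consistency and extended candidate sampling. The function name `synchronous_refine` and the parameters `lam`, `trunc`, and `iters` are hypothetical and chosen only for illustration.

```python
import numpy as np


def synchronous_refine(unary, neighbors, lam=1.0, trunc=2.0, iters=5):
    """Refine per-superpixel depth labels with a synchronous update.

    unary     : (S, D) array, matching cost of each of D depth labels
                for each of S superpixels (assumed to be precomputed).
    neighbors : list of S lists with the indices of adjacent superpixels.
    lam       : weight of the smoothness term (hypothetical parameter).
    trunc     : truncation of the linear smoothness penalty.
    """
    num_superpixels, num_labels = unary.shape
    candidates = np.arange(num_labels)
    # Winner-takes-all initialisation from the data term alone.
    labels = unary.argmin(axis=1)
    for _ in range(iters):
        # Every superpixel reads only the labels of the previous
        # iteration, so all rows could be updated in parallel.
        prev = labels.copy()
        total = unary.astype(float)
        for s in range(num_superpixels):
            for n in neighbors[s]:
                penalty = np.minimum(np.abs(candidates - prev[n]), trunc)
                total[s] += lam * penalty
        labels = total.argmin(axis=1)
    return labels


# Toy chain of three superpixels; the middle one is nearly textureless
# and initially picks a label that disagrees with its neighbours.
unary = np.array([[0.0, 5.0, 5.0],
                  [1.0, 1.0, 0.9],
                  [0.0, 5.0, 5.0]])
neighbors = [[1], [0, 2], [1]]
print(synchronous_refine(unary, neighbors).tolist())
```

In this toy case the ambiguous middle superpixel adopts the label of its confident neighbors after a single iteration. Because each update depends only on the previous iteration's labels, the loop over superpixels has no ordering dependency, which is the property that makes such a schedule suitable for graphics hardware.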
