Benchmarking of several disparity estimation algorithms for light field processing

A number of high-quality depth image-based rendering (DIBR) pipelines have been developed to reconstruct a 3D scene from several images taken from known camera viewpoints. Because each technique has its own limitations, the rendered output is prone to artifacts, and consistent quality cannot be guaranteed. Improving the quality of the most critical and challenging image areas therefore requires an exhaustive comparison. In this paper, we address three questions when benchmarking the quality of eight DIBR techniques on light fields: First, how does the density of the original input views affect the quality of the rendered novel views? Second, how does the disparity range between adjacent input views impact quality? Third, how does each technique behave for different object properties? We compared and evaluated the results both visually and quantitatively (PSNR, SSIM, AD, and HDR-VDP-2). The results show that some techniques outperform others in different disparity ranges. They also indicate that using more views does not necessarily yield visually higher quality in all critical image areas. Finally, we present a comparison across scenes of different complexity, for example scenes containing non-Lambertian objects.
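As an illustration of the full-reference evaluation described above, the sketch below shows how PSNR, SSIM, and a mean absolute-difference (AD) score could be computed for one rendered novel view against its ground-truth counterpart. This is a minimal sketch, not the authors' implementation: the function name compare_views and the use of scikit-image are assumptions, and HDR-VDP-2 is omitted because it is normally run through its own MATLAB toolbox.

```python
# Minimal sketch (assumed tooling, not the paper's code): full-reference quality
# scores for one rendered view, given ground truth. Both images are assumed to be
# 8-bit RGB NumPy arrays of identical shape.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def compare_views(rendered: np.ndarray, reference: np.ndarray) -> dict:
    """Return PSNR, SSIM, and mean absolute difference for one rendered view."""
    psnr = peak_signal_noise_ratio(reference, rendered, data_range=255)
    ssim = structural_similarity(reference, rendered, channel_axis=-1, data_range=255)
    # AD: absolute difference averaged over all pixels and channels.
    ad = float(np.mean(np.abs(reference.astype(np.float64) - rendered.astype(np.float64))))
    return {"PSNR": psnr, "SSIM": ssim, "AD": ad}
```

In a benchmark of this kind, such a function would be applied per rendered view and the scores aggregated per technique, input-view density, and disparity range to produce the comparisons discussed above.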
