Light-field view synthesis using convolutional block attention module

Consumer light-field (LF) cameras suffer from limited spatial resolution because of the inherent angular-spatial trade-off. To alleviate this drawback, we propose a novel learning-based approach that employs an attention mechanism to synthesize novel views of a light-field image from a sparse set of input views (i.e., the four corner views) of a camera array. The proposed method divides the process into three stages: stereo feature extraction, disparity estimation, and final image refinement, with a dedicated convolutional neural network applied sequentially at each stage. A residual convolutional block attention module (CBAM) performs the final adaptive image refinement. Attention modules help the network learn and focus on the most important features of the image, and are therefore applied sequentially along the channel and spatial dimensions. Experimental results demonstrate the robustness of the proposed method: our network outperforms state-of-the-art learning-based light-field view synthesis methods on two challenging real-world datasets by 0.5 dB on average. Furthermore, we provide an ablation study to substantiate our findings.
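The refinement stage described above builds on CBAM, which applies channel attention followed by spatial attention to a feature map. The following is a minimal NumPy sketch of that two-step operation under the standard CBAM formulation; the MLP weights passed in and the box-filter stand-in for the learned spatial convolution are illustrative assumptions, not the trained parameters of the proposed network.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x, w1, w2):
    """x: feature map of shape (C, H, W); w1, w2: shared MLP weights."""
    # Global average- and max-pooling over spatial dims -> two (C,) descriptors.
    avg = x.mean(axis=(1, 2))
    mx = x.max(axis=(1, 2))
    # Shared two-layer MLP (ReLU bottleneck) applied to both, summed, squashed.
    att = sigmoid(w2 @ np.maximum(w1 @ avg, 0) + w2 @ np.maximum(w1 @ mx, 0))
    return x * att[:, None, None]          # rescale each channel

def spatial_attention(x, k=7):
    """x: (C, H, W); k: spatial kernel size (CBAM uses 7)."""
    # Channel-wise average and max maps, stacked into a (2, H, W) descriptor.
    maps = np.stack([x.mean(axis=0), x.max(axis=0)])
    # k x k conv collapsing 2 channels to 1; a uniform box filter stands in
    # for the learned weights here.
    pad = k // 2
    padded = np.pad(maps, ((0, 0), (pad, pad), (pad, pad)))
    _, H, W = x.shape
    out = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = padded[:, i:i + k, j:j + k].mean()
    return x * sigmoid(out)[None]          # rescale each spatial location

def cbam(x, w1, w2):
    # Sequential order as in the paper: channel attention, then spatial.
    return spatial_attention(channel_attention(x, w1, w2))
```

A residual CBAM block, as used for the final refinement, would then compute `y = x + cbam(f(x), w1, w2)` for some convolutional mapping `f`, so the attention-weighted features are added back onto the input.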
