Learning Light Field Synthesis with Multi-Plane Images: Scene Encoding as a Recurrent Segmentation Task

In this paper we address the problem of view synthesis from large-baseline light fields by turning a sparse set of input views into a Multi-Plane Image (MPI). Because available datasets are scarce, we propose a lightweight network that does not require extensive training. Unlike recent approaches, our model does not learn to estimate RGB layers but only encodes the scene geometry within the MPI alpha layers, which amounts to a segmentation task. A Learned Gradient Descent (LGD) framework cascades the same convolutional network in a recurrent fashion to refine the resulting volumetric representation. Thanks to its low number of parameters, our model trains successfully on a small light-field video dataset and produces visually appealing results. It also generalizes conveniently with respect to the number of input views, the number of depth planes in the MPI, and the number of refinement iterations.
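As a minimal illustration (an assumption-laden sketch, not the authors' implementation), the following PyTorch snippet shows the two ideas the abstract combines: (i) rendering a novel view from an MPI by back-to-front "over" compositing of RGB layers weighted by alpha layers, and (ii) an LGD-style loop in which a single shared convolutional network is applied recurrently to refine the alpha volume. The names `render_mpi` and `AlphaRefiner`, the layer sizes, and the placeholder feedback volume are all hypothetical.

```python
# Sketch only: MPI alpha compositing plus a shared refinement network
# applied recurrently, in the spirit of Learned Gradient Descent.
import torch
import torch.nn as nn

def render_mpi(rgb, alpha):
    """Back-to-front 'over' compositing of an MPI.
    rgb:   (D, 3, H, W) color layers (e.g. input views re-projected to each plane).
    alpha: (D, 1, H, W) transparency layers in [0, 1], plane 0 = nearest.
    """
    out = torch.zeros_like(rgb[0])
    for d in reversed(range(rgb.shape[0])):      # farthest plane first
        out = alpha[d] * rgb[d] + (1.0 - alpha[d]) * out
    return out

class AlphaRefiner(nn.Module):
    """One small CNN, reused with shared weights at every refinement iteration."""
    def __init__(self, depth_planes):
        super().__init__()
        c = 2 * depth_planes                      # current alphas + a feedback/error volume
        self.net = nn.Sequential(
            nn.Conv2d(c, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, depth_planes, 3, padding=1),
        )

    def forward(self, alpha, feedback):
        # Predict an update to the alpha volume from its current state and a feedback signal.
        x = torch.cat([alpha, feedback], dim=1)
        return torch.sigmoid(alpha + self.net(x))  # keep refined alphas in [0, 1]

# Toy usage: D depth planes, a few refinement iterations of the same network.
D, H, W = 8, 64, 64
rgb = torch.rand(D, 3, H, W)                      # stand-in for plane-swept input views
alpha = torch.full((1, D, H, W), 0.5)             # initial alpha guess
refiner = AlphaRefiner(D)
for _ in range(3):                                # recurrent LGD-style refinement
    feedback = torch.rand(1, D, H, W)             # placeholder for a gradient/error volume
    alpha = refiner(alpha, feedback)
view = render_mpi(rgb, alpha.squeeze(0).unsqueeze(1))
```

In the actual method the feedback volume would carry rendering-error information back into the network, and only the alpha layers are predicted; the RGB layers come directly from the input views, which is what reduces the problem to a segmentation-like task.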
