360 Panorama Synthesis from a Sparse Set of Images on a Low-Power Device

A full 360° × 180° image covers the entire field of view (FOV) and gives users an immersive experience without losing any information about the surroundings. In this study, a deep learning based approach is proposed to synthesize a 360° image from a sparse set of images captured with a limited FOV. The proposed network is a cascade of an FOV estimation network and a panorama synthesis network. We propose a hierarchical generative network to synthesize high-quality 360° panorama images. The design, which combines a progressive multi-scale generator with multiple discriminators, reduces the high-frequency artifacts commonly observed in image synthesis with generative networks. The network is further compressed with our proposed size and latency optimization so that it can run on a low-power device such as a smartphone. Experimental results demonstrate that the proposed method produces 360° panoramas with satisfactory image quality at resolutions of up to 512 × 1024. It is also shown that the proposed method outperforms the alternative method and generalizes to non-panoramic scenes and to real images captured by a smartphone camera.
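As a rough illustration of the geometry behind these figures, the sketch below relates an input image's horizontal FOV to the share of a 512 × 1024 panorama it would cover. It assumes an equirectangular layout, which the abstract does not state explicitly but which matches the 2:1 aspect ratio of a 360° × 180° image. The function names `fov_to_columns` and `degrees_per_pixel` are hypothetical and are not taken from the paper.

```python
def degrees_per_pixel(pano_height: int = 512, pano_width: int = 1024) -> tuple[float, float]:
    """Angular step per pixel (horizontal, vertical), assuming an equirectangular panorama."""
    return 360.0 / pano_width, 180.0 / pano_height


def fov_to_columns(fov_deg: float, pano_width: int = 1024) -> int:
    """Number of panorama columns spanned by a horizontal FOV (assumed equirectangular)."""
    if not 0.0 < fov_deg <= 360.0:
        raise ValueError("fov_deg must be in (0, 360]")
    # The full 360 degrees maps linearly onto the panorama width.
    return round(fov_deg / 360.0 * pano_width)


if __name__ == "__main__":
    print(degrees_per_pixel())   # angular resolution of a 512 x 1024 panorama
    print(fov_to_columns(90.0))  # columns covered by a 90-degree input view
```

Under this assumption a single 90° view fills only a quarter of the panorama width, which gives a sense of how much content the synthesis network has to generate from a sparse set of inputs.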
