Panoramic Image Generation: From 2-D Sketch to Spherical Image

360-degree images and videos, also called omnidirectional or panoramic images and videos, are important in emerging areas such as virtual reality (VR), so image generation algorithms designed for them are urgently needed. However, existing image generation models mainly focus on 2-D images and do not consider the spherical structure of panoramic images. In this article, we propose a panoramic image generation method based on spherical convolution and generative adversarial networks, called spherical generative adversarial networks (SGANs). We adopt the sketch map as the input; it is a concise representation of the geometric structure of the panoramic image, comprising, for example, approximately 7% of the pixels of a 583 × 1163 image. Through adversarial learning, a realistic, plausible, and high-fidelity spherical image can be obtained from the sparse sketch map. In particular, we build a dataset of sketch maps using a visual computation-based sketching model. Then, by optimizing SGANs with a GAN loss, a feature matching loss, and a perceptual loss, realistic textures and details are gradually recovered. On the one hand, using the sparse sketch map as input is an improvement over denser inputs such as texture and color features. On the other hand, spherical convolution helps to remedy the space-varying distortions of the planar projection. We conduct extensive experiments on public panoramic image datasets and compare SGANs with state-of-the-art techniques to validate the superior performance of the proposed approach.
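To make two of the quantities above concrete, the following minimal sketch computes the approximate number of sketch pixels for the quoted image size and the horizontal stretch that a planar projection introduces at each image row. It assumes the planar projection is equirectangular, which the abstract does not state, and the function names are hypothetical; it illustrates the geometry only and is not the SGAN implementation.

```python
import math

# Figures quoted in the abstract.
HEIGHT, WIDTH = 583, 1163
SKETCH_FRACTION = 0.07  # "approximately 7% of the pixels"


def sketch_pixel_count(height, width, fraction):
    """Approximate number of pixels carried by the sparse sketch map."""
    return round(height * width * fraction)


def row_latitude(row, height):
    """Latitude (radians) of the centre of a pixel row.

    Assumption: equirectangular layout, with row 0 nearest the north pole.
    """
    return math.pi / 2 - (row + 0.5) * math.pi / height


def horizontal_stretch(row, height):
    """Factor by which the planar image stretches a row horizontally.

    A circle of latitude phi on the unit sphere has circumference
    2*pi*cos(phi), but every row of the planar image has the same width,
    so the stretch relative to the equator is 1 / cos(phi).
    """
    return 1.0 / math.cos(row_latitude(row, height))


if __name__ == "__main__":
    print("sketch pixels:", sketch_pixel_count(HEIGHT, WIDTH, SKETCH_FRACTION))
    for row in (0, HEIGHT // 4, HEIGHT // 2):
        print(f"row {row:3d}: stretch x{horizontal_stretch(row, HEIGHT):.2f}")
```

Under this assumption the stretch is 1 at the equator and grows without bound toward the poles, which is the kind of space-varying distortion that a fixed planar convolution kernel cannot account for.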
