Paper Information - Distortion-aware CNNs for Spherical Images

Convolutional neural networks are widely used in computer vision applications. Despite their success, they cannot be applied directly to 360° spherical images because the distortion varies across the image. In this paper, we present a distortion-aware convolutional network for spherical images. For each pixel, our network samples a non-regular grid based on the pixel's distortion level and convolves the sampled grid with square kernels shared by all pixels. The network successively approximates large image patches from different tangent planes of the viewing sphere with small local sampling grids, which improves computational efficiency. Our method also handles the boundary problem, an inherent issue for spherical images. To evaluate our method, we apply the network to spherical image classification on transformed MNIST and CIFAR-10 datasets. Our method performs much better than the baseline method. We also analyze variants of our network.
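The abstract does not say how the non-regular grid is computed, so the following is only a minimal sketch of one plausible reading. It assumes the spherical image is stored in equirectangular format and that the kernel taps form a regular grid on the tangent plane at each pixel, mapped back to the image by the inverse gnomonic projection. The function name `sampling_grid` and its parameters are hypothetical and do not come from the paper.

```python
import numpy as np


def sampling_grid(row, col, height, width, ksize=3):
    """Return a (ksize, ksize, 2) array of fractional (row, col) sample
    positions for a kernel centred at pixel (row, col).

    Assumption (not from the paper): equirectangular input, with kernel
    taps laid out regularly on the tangent plane at the centre pixel.
    """
    d_lon = 2.0 * np.pi / width   # angular width of one pixel
    d_lat = np.pi / height        # angular height of one pixel

    # Longitude and latitude of the kernel centre (pixel-centre convention).
    lon0 = (col + 0.5) * d_lon - np.pi
    lat0 = np.pi / 2.0 - (row + 0.5) * d_lat

    # Regular grid on the tangent plane; y points towards the north pole,
    # so kernel row 0 gets the largest y.
    offsets = np.arange(ksize) - ksize // 2
    x, y = np.meshgrid(offsets * np.tan(d_lon), -offsets * np.tan(d_lat))

    # Inverse gnomonic projection: tangent plane -> sphere.
    rho = np.hypot(x, y)
    c = np.arctan(rho)
    safe_rho = np.where(rho == 0.0, 1.0, rho)  # avoid 0/0 at the centre tap
    sin_lat = (np.cos(c) * np.sin(lat0)
               + y * np.sin(c) * np.cos(lat0) / safe_rho)
    lat = np.arcsin(np.clip(sin_lat, -1.0, 1.0))
    lon = lon0 + np.arctan2(
        x * np.sin(c),
        rho * np.cos(lat0) * np.cos(c) - y * np.sin(lat0) * np.sin(c),
    )

    # Sphere -> fractional pixel coordinates. Columns wrap around the
    # left/right seam; rows are clipped to the valid range.
    rows = np.clip((np.pi / 2.0 - lat) / d_lat - 0.5, 0.0, height - 1.0)
    cols = np.mod((lon + np.pi) / d_lon - 0.5, width)
    return np.stack([rows, cols], axis=-1)


if __name__ == "__main__":
    # Compare the grid near the equator with the grid in the top row.
    print(sampling_grid(31.5, 63.5, 64, 128)[..., 1])
    print(sampling_grid(0, 64, 64, 128)[..., 1])
```

In this sketch the grid is almost regular near the equator and stretches strongly in longitude near the poles. The modulo on the column index is one simple way to treat the left/right image boundary as continuous. A convolution layer built on this idea would interpolate the input at the returned fractional positions (for example bilinearly) and then apply the same square kernel at every pixel.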
