Paper Information - Learning SO(3) Equivariant Representations with Spherical CNNs

Learning SO(3) Equivariant Representations with Spherical CNNs

We address the problem of 3D rotation equivariance in convolutional neural networks. 3D rotations have been a challenging nuisance in 3D classification tasks, requiring higher capacity and extended data augmentation to handle them. We model 3D data with multi-valued spherical functions and propose a novel spherical convolutional network that implements exact convolutions on the sphere by realizing them in the spherical harmonic domain. The resulting filters have local symmetry and are localized by enforcing smooth spectra. We apply a novel pooling in the spectral domain, and our operations are independent of the underlying spherical resolution throughout the network. We show that networks with much lower capacity, and without data augmentation, can achieve performance comparable to the state of the art on standard retrieval and classification benchmarks.
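
As a rough illustration of what "realizing convolutions in the spherical harmonic domain" can look like, the sketch below applies a zonal (rotationally symmetric about the pole) filter to a set of spherical harmonic coefficients by per-degree multiplication, using the standard spherical convolution theorem with the factor 2π·sqrt(4π/(2l+1)). Everything here is an assumption made for illustration and is not taken from the paper's code: the function names (`interpolate_spectrum`, `zonal_spectral_conv`, `rotate_z`), the list-of-lists coefficient layout, the linear interpolation of a few anchor values as one way to get a smooth filter spectrum, and the restriction of the equivariance check to rotations about the z-axis (general rotations would need Wigner-D matrices).

```python
import cmath
import math


def interpolate_spectrum(anchors, bandwidth):
    """Expand a few anchor values into one filter coefficient per degree l.

    Hypothetical helper: linear interpolation is one simple way to obtain
    a smooth spectrum from a small number of parameters.
    """
    if len(anchors) < 2:
        raise ValueError("need at least two anchor values")
    spectrum = []
    for l in range(bandwidth):
        # Position of degree l on the anchor axis [0, len(anchors) - 1].
        t = l * (len(anchors) - 1) / (bandwidth - 1) if bandwidth > 1 else 0.0
        i = min(int(t), len(anchors) - 2)
        w = t - i
        spectrum.append((1 - w) * anchors[i] + w * anchors[i + 1])
    return spectrum


def zonal_spectral_conv(f_hat, h_hat):
    """Convolve with a zonal filter in the spectral domain.

    f_hat[l] holds the 2l+1 complex coefficients for orders m = -l..l.
    h_hat[l] is the filter's order-0 coefficient at degree l.
    """
    out = []
    for l, coeffs in enumerate(f_hat):
        scale = 2 * math.pi * math.sqrt(4 * math.pi / (2 * l + 1)) * h_hat[l]
        out.append([scale * c for c in coeffs])
    return out


def rotate_z(f_hat, alpha):
    """Rotate about the z-axis: the order-m coefficient gains exp(-i m alpha)."""
    return [
        [cmath.exp(-1j * m * alpha) * c
         for m, c in zip(range(-l, l + 1), coeffs)]
        for l, coeffs in enumerate(f_hat)
    ]


# Made-up coefficients for a band-limited signal with bandwidth 4.
BANDWIDTH = 4
f_hat = [
    [complex(l + 1, m) for m in range(-l, l + 1)]
    for l in range(BANDWIDTH)
]
h_hat = interpolate_spectrum([1.0, 0.5, 0.0], BANDWIDTH)

# Filtering then rotating should match rotating then filtering.
a = rotate_z(zonal_spectral_conv(f_hat, h_hat), 0.7)
b = zonal_spectral_conv(rotate_z(f_hat, 0.7), h_hat)
max_gap = max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
print("max equivariance gap:", max_gap)
```

Because the filter acts as one scalar per degree, the number of filter parameters in this sketch depends only on the number of anchors and not on how finely the sphere is sampled.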
