Paper information - Learning SO(3) Equivariant Representations with Spherical CNNs

We address the problem of 3D rotation equivariance in convolutional neural networks. 3D rotations have been a challenging nuisance in 3D classification tasks, requiring higher capacity and extensive data augmentation to handle. We model 3D data with multi-valued spherical functions and propose a novel spherical convolutional network that implements exact convolutions on the sphere by realizing them in the spherical harmonic domain. The resulting filters have local symmetry and are localized by enforcing smooth spectra. We apply a novel pooling in the spectral domain, and our operations are independent of the underlying spherical resolution throughout the network. We show that networks with much lower capacity, trained without data augmentation, can exhibit performance comparable to the state of the art on standard 3D shape retrieval and classification benchmarks.
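To make the spectral-domain idea in the abstract concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes the standard result for convolution with a zonal (axially symmetric) filter, where each degree-l block of spherical harmonic coefficients is scaled by 2π·sqrt(4π/(2l+1)) times the filter's order-0 coefficient. The function names, the linear interpolation from a few anchor values (one simple way to obtain a smooth spectrum), the truncation-based pooling, and the e^{-i m α} sign convention for z-axis rotations are all illustrative assumptions.

```python
import numpy as np


def interpolate_spectrum(anchor_values, max_degree):
    """Build a smooth zonal filter spectrum h^l_0 for l = 0..max_degree.

    Assumption: the anchors are spread evenly over the degrees and joined
    by linear interpolation, so few parameters give a smooth spectrum.
    """
    anchor_values = np.asarray(anchor_values, dtype=float)
    anchor_degrees = np.linspace(0, max_degree, num=len(anchor_values))
    return np.interp(np.arange(max_degree + 1), anchor_degrees, anchor_values)


def spectral_sphere_conv(f_hat, h_hat_zonal):
    """Convolve f with a zonal filter h directly on harmonic coefficients.

    f_hat[l] holds the 2l+1 coefficients f^l_m for m = -l..l.
    Each degree is multiplied by a single scalar, so no spatial grid is used.
    """
    out = []
    for l, coeffs in enumerate(f_hat):
        scale = 2.0 * np.pi * np.sqrt(4.0 * np.pi / (2 * l + 1))
        out.append(scale * h_hat_zonal[l] * coeffs)
    return out


def spectral_pool(f_hat, new_max_degree):
    """Illustrative spectral pooling: keep only degrees 0..new_max_degree."""
    return f_hat[: new_max_degree + 1]


def rotate_z(f_hat, alpha):
    """Rotate about the z axis by alpha (sign convention is an assumption)."""
    out = []
    for l, coeffs in enumerate(f_hat):
        m = np.arange(-l, l + 1)
        out.append(np.exp(-1j * m * alpha) * coeffs)
    return out


# Hypothetical input: random coefficients up to degree 8.
rng = np.random.default_rng(0)
L = 8
f_hat = [rng.normal(size=2 * l + 1) + 1j * rng.normal(size=2 * l + 1)
         for l in range(L + 1)]

h_hat = interpolate_spectrum([1.0, 0.5, 0.1], L)  # 3 anchors -> 9 degrees
g_hat = spectral_sphere_conv(f_hat, h_hat)
pooled = spectral_pool(g_hat, L // 2)

# Equivariance check for z-axis rotations: rotating then convolving
# matches convolving then rotating.
alpha = 0.7
lhs = spectral_sphere_conv(rotate_z(f_hat, alpha), h_hat)
rhs = rotate_z(g_hat, alpha)
print(all(np.allclose(a, b) for a, b in zip(lhs, rhs)))
```

Because the filter acts as one scalar per degree, it also commutes with a general rotation, which mixes coefficients only within each degree. The sketch checks only the simpler z-axis case.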

A. Makadia | Kostas Daniilidis | Carlos Esteves | Christine Allen-Blanchette
