A New Rotation-Invariant Deep Network for 3D Object Recognition

Most 3D convolutional neural networks (CNNs) suffer a substantial drop in performance when their inputs are rotated, especially models that take voxelized 3D objects as input. The recently proposed Spherical CNNs, built on the rotation-equivariant spherical correlation, aim to achieve rotation invariance. Inspired by this, we propose a new rotation-invariant deep network for recognizing rotated 3D objects. Specifically, we adopt the spherical representation and the S^2 spherical correlation layer of Spherical CNNs for their capacity to represent 3D objects and for their rotation equivariance. At the same time, we improve the computational efficiency and expressiveness of Spherical CNNs by replacing their time-consuming and depth-limited SO(3) layer with a PointNet-style network architecture. As a result, our network maintains equivariance as it grows deeper while substantially reducing runtime, yielding a more efficient and more expressive rotation-invariant representation. Experimental results show that our network performs better than or comparably to state-of-the-art methods in the ModelNet40 classification challenge.
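
To illustrate why a PointNet-style stage is a natural fit after an equivariant layer, here is a minimal sketch that is not the authors' implementation. It assumes the equivariant feature map is treated as an unordered set of per-sample feature vectors; the names `shared_mlp` and `pointnet_style_pool`, the layer sizes, and the random weights are all hypothetical. It shows only the generic PointNet idea: a shared layer applied to every sample, followed by a channel-wise max, which makes the descriptor independent of the order of the samples.

```python
import random


def shared_mlp(features, weights, biases):
    """Apply one ReLU layer (same weights for every sample) to a feature vector."""
    out = []
    for w_row, b in zip(weights, biases):
        s = sum(w * x for w, x in zip(w_row, features)) + b
        out.append(max(0.0, s))
    return out


def pointnet_style_pool(samples, weights, biases):
    """Embed each sample with the shared layer, then take a channel-wise max."""
    embedded = [shared_mlp(f, weights, biases) for f in samples]
    return [max(channel) for channel in zip(*embedded)]


# Hypothetical sizes and random weights, for illustration only.
rng = random.Random(0)
in_ch, out_ch, n_samples = 4, 8, 16
weights = [[rng.uniform(-1, 1) for _ in range(in_ch)] for _ in range(out_ch)]
biases = [rng.uniform(-1, 1) for _ in range(out_ch)]
samples = [[rng.uniform(-1, 1) for _ in range(in_ch)] for _ in range(n_samples)]

descriptor = pointnet_style_pool(samples, weights, biases)

# Reordering the samples leaves the pooled descriptor unchanged.
shuffled = samples[:]
rng.shuffle(shuffled)
descriptor_shuffled = pointnet_style_pool(shuffled, weights, biases)
```

On a discrete grid, a rotation generally rearranges the samples of an equivariant feature map only approximately, so order-independent pooling of this kind gives approximate rather than exact rotation invariance. How the proposed network wires this stage to the S^2 layer is not specified in the abstract.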
