3D Capsule Networks for Object Classification from 3D Model Data

Many existing object classification methods rely on convolutional neural networks (CNNs), which are very successful at extracting features from data. However, CNNs do not adequately capture the spatial relationships between features, and they require large amounts of training data. In this paper, we propose a new architecture for 3D object classification that extends Capsule Networks (CapsNets) to 3D data. The proposed 3D CapsNet architecture preserves the orientation and spatial relationships of the extracted features, and thus requires less training data. We compare our approach with a ShapeNet-inspired model and show that our method improves performance, especially as the training set gets smaller. We also compare and evaluate several versions of the 3D CapsNet architecture.
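
The abstract does not spell out the 3D architecture, so the following is only a minimal sketch of the general capsule idea it builds on, not the authors' implementation. It shows the squash nonlinearity from the original CapsNet formulation ("Dynamic Routing Between Capsules"): a capsule outputs a vector whose direction is preserved, which is how orientation information can be carried forward, while its length is mapped into [0, 1) so it can act as a presence probability. The function names, the `eps` parameter, and the input values are assumptions made for illustration.

```python
import math

def squash(s, eps=1e-9):
    """Squash nonlinearity: keep the direction of s, map its length into [0, 1).

    v = (|s|^2 / (1 + |s|^2)) * (s / |s|)
    `eps` is an illustrative guard against division by zero.
    """
    sq_norm = sum(x * x for x in s)
    norm = math.sqrt(sq_norm)
    scale = sq_norm / (1.0 + sq_norm) / (norm + eps)
    return [scale * x for x in s]

def length(v):
    """Euclidean length of a capsule output vector."""
    return math.sqrt(sum(x * x for x in v))

# Hypothetical capsule pre-activation; the values are illustrative only.
s = [3.0, 0.0, 4.0]
v = squash(s)
```

Here `length(v)` is close to 25/26 because the input has length 5, and the ratio between the components of `v` matches that of `s`, so only the magnitude is changed.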
