Design, analysis and application of a volumetric convolutional neural network

This work studies the design, analysis and application of a volumetric convolutional neural network (VCNN). Although many CNNs have been proposed in the literature, their design is largely empirical. For the design of the VCNN, we propose a feed-forward K-means clustering algorithm that systematically determines the number and size of the filters at each convolutional layer. For the analysis of the VCNN, we explain why certain classes are confused in the network output by analyzing the relationship among the filter weights (also known as anchor vectors) that connect the last fully-connected layer to the output. Furthermore, we propose a hierarchical clustering method followed by a random forest classification method to boost the classification performance among confusing classes. For the application of the VCNN, we examine the 3D shape classification problem and conduct experiments on the popular ModelNet40 dataset. The proposed VCNN achieves state-of-the-art performance among volume-based CNN methods.
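As a rough illustration of the anchor-vector analysis described above, the sketch below compares the output-layer weight vectors of different classes and reports the pair that points in the most similar direction. The abstract does not state which relationship between anchor vectors is analyzed; cosine similarity is an assumption made here, and the function names, class names and numbers are all hypothetical toy values, not results from the paper.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar_pairs(anchors, top_k=1):
    """Rank class pairs by the cosine similarity of their anchor vectors.

    `anchors` maps a class name to the weight vector that connects the
    last fully-connected layer to that class's output node (assumed layout).
    """
    names = sorted(anchors)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            pairs.append((cosine_similarity(anchors[a], anchors[b]), a, b))
    pairs.sort(reverse=True)  # highest similarity first
    return pairs[:top_k]

# Toy anchor vectors (made-up values, for illustration only).
toy_anchors = {
    "desk": [0.9, 0.1, 0.4],
    "table": [0.8, 0.2, 0.5],
    "airplane": [-0.3, 0.9, -0.1],
}

best = most_similar_pairs(toy_anchors, top_k=1)
print(best)
```

Under this assumption, class pairs whose anchor vectors are nearly parallel would be candidates for the "confusing classes" that the hierarchical clustering and random forest stages are meant to separate.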
