DRCNN: Dynamic Routing Convolutional Neural Network for Multi-View 3D Object Recognition

3D object recognition is one of the most important tasks in 3D data processing and has been studied extensively in recent years. Researchers have proposed various deep-learning-based 3D recognition methods, among which view-based approaches form a representative class. However, the view pooling layer that these methods commonly use to fuse multi-view features causes a loss of visual information. To alleviate this problem, we construct a novel layer, the Dynamic Routing Layer (DRL), by modifying the dynamic routing algorithm of capsule networks so that the features of each view are fused more effectively. Concretely, DRL converts the features through rearrangement and an affine transformation, then uses the modified dynamic routing algorithm to adaptively select among the converted features, rather than ignoring all but the most active feature as the view pooling layer does. We also show that the view pooling layer is a special case of DRL. Building on DRL, we further present the Dynamic Routing Convolutional Neural Network (DRCNN) for multi-view 3D object recognition. Experiments on three 3D benchmark datasets show that DRCNN outperforms many state-of-the-art methods, which demonstrates the efficacy of our approach.
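
The contrast between view pooling and routing-based fusion can be illustrated with a minimal NumPy sketch. It is not the DRL described above: it omits the rearrangement and affine transformation, and it uses a generic routing-by-agreement loop with a single output vector and a softmax taken over views, which is an assumption made for illustration. The function names (`view_pooling`, `routing_fuse`, `hard_select_fuse`) and the array shapes are hypothetical.

```python
import numpy as np


def view_pooling(u):
    """Element-wise max over views: (V, D) -> (D,)."""
    return u.max(axis=0)


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def squash(s, eps=1e-8):
    """Capsule-style squashing: keeps direction, maps the norm into [0, 1)."""
    n2 = float(np.dot(s, s))
    return (n2 / (1.0 + n2)) * s / np.sqrt(n2 + eps)


def routing_fuse(u, num_iters=3):
    """Fuse V view vectors (V, D) into one vector (D,) by routing-by-agreement.

    Assumption for this sketch: one output vector, coupling weights
    normalised over views. Returns the fused vector and the weights
    used in the last iteration.
    """
    b = np.zeros(u.shape[0])          # routing logits, one per view
    for _ in range(num_iters):
        c = softmax(b)                # coupling weights over views
        s = c @ u                     # weighted sum of view features
        v = squash(s)                 # fused output vector
        b = b + u @ v                 # raise logits of views that agree with v
    return v, c


def hard_select_fuse(u):
    """Weighted fusion with one-hot per-channel weights on the most active view."""
    idx = u.argmax(axis=0)            # most active view for each channel
    w = np.zeros_like(u)
    w[idx, np.arange(u.shape[1])] = 1.0
    return (w * u).sum(axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    views = rng.standard_normal((12, 8))   # 12 views, 8-D features (made-up sizes)
    fused, weights = routing_fuse(views)
    print("routing weights:", np.round(weights, 3))
    print("matches max pooling:",
          np.allclose(hard_select_fuse(views), view_pooling(views)))
```

`hard_select_fuse` shows one way a weighted fusion can collapse to max pooling: when every channel puts all of its weight on the most active view, the weighted sum equals the element-wise maximum. Whether this matches the argument used for DRL is not stated in the abstract, so it should be read only as intuition.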
