DAN: Deep-Attention Network for 3D Shape Recognition

Due to the wide applications in a rapidly increasing number of different fields, 3D shape recognition has become a hot topic in the computer vision field. Many approaches have been proposed in recent years. However, there remain huge challenges in two aspects: exploring the effective representation of 3D shapes and reducing the redundant complexity of 3D shapes. In this paper, we propose a novel deep-attention network (DAN) for 3D shape representation based on multiview information. More specifically, we introduce the attention mechanism to construct a deep multiattention network that has advantages in two aspects: 1) information selection, in which DAN utilizes the self-attention mechanism to update the feature vector of each view, effectively reducing the redundant information, and 2) information fusion, in which DAN applies attention mechanism that can save more effective information by considering the correlations among views. Meanwhile, deep network structure can fully consider the correlations to continuously fuse effective information. To validate the effectiveness of our proposed method, we conduct experiments on the public 3D shape datasets: ModelNet40, ModelNet10, and ShapeNetCore55. Experimental results and comparison with state-of-the-art methods demonstrate the superiority of our proposed method. Code is released on https://github.com/RiDang/DANN.

[1]  Yang Cong,et al.  L3DOC: Lifelong 3D Object Classification , 2019, 1912.06135.

[2]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Victor S. Lempitsky,et al.  Escape from Cells: Deep Kd-Networks for the Recognition of 3D Point Cloud Models , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[4]  Ludovico Minto,et al.  Deep Learning for 3D Shape Classification based on Volumetric Density and Surface Approximation Clues , 2018, VISIGRAPP.

[5]  Qiang Huang,et al.  View-based weight network for 3D object recognition , 2020, Image Vis. Comput..

[6]  Zhichao Zhou,et al.  DeepPano: Deep Panoramic Representation for 3-D Shape Recognition , 2015, IEEE Signal Processing Letters.

[7]  Wei Wu,et al.  PointCNN: Convolution On X-Transformed Points , 2018, NeurIPS.

[8]  Szymon Rusinkiewicz,et al.  Rotation Invariant Spherical Harmonic Representation of 3D Shape Descriptors , 2003, Symposium on Geometry Processing.

[9]  Yueting Zhuang,et al.  Video Dialog via Multi-Grained Convolutional Self-Attention Context Networks , 2019, SIGIR.

[10]  Subhransu Maji,et al.  Multi-view Convolutional Neural Networks for 3D Shape Recognition , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[11]  Leonidas J. Guibas,et al.  ShapeNet: An Information-Rich 3D Model Repository , 2015, ArXiv.

[12]  Shiming Xiang,et al.  Relation-Shape Convolutional Neural Network for Point Cloud Analysis , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Mark Segal,et al.  The Design of the OpenGL Graphics Interface , 1998 .

[14]  Yue Gao,et al.  GVCNN: Group-View Convolutional Neural Networks for 3D Shape Recognition , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[15]  Yue Gao,et al.  PVNet: A Joint Convolutional Network of Point Cloud and Multi-View for 3D Shape Recognition , 2018, ACM Multimedia.

[16]  Luca Antiga,et al.  Automatic differentiation in PyTorch , 2017 .

[17]  Sebastian Scherer,et al.  VoxNet: A 3D Convolutional Neural Network for real-time object recognition , 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[18]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[19]  Liwei Wang,et al.  Learning Relationships for Multi-View 3D Object Recognition , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[20]  Junwei Han,et al.  3D2SeqViews: Aggregating Sequential Views for 3D Global Feature Learning by CNN With Hierarchical Attention Aggregation , 2019, IEEE Transactions on Image Processing.

[21]  Ming Hao,et al.  Linked Dynamic Graph CNN: Learning on Point Cloud via Linking Hierarchical Features , 2019, ArXiv.

[22]  Xuanjing Huang,et al.  Adaptive Multi-Attention Network Incorporating Answer Information for Duplicate Question Detection , 2019, SIGIR.

[23]  Stefan Leutenegger,et al.  Pairwise Decomposition of Image Sequences for Active Multi-view Recognition , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[24]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Junwei Han,et al.  SeqViews2SeqLabels: Learning 3D Global Features via Aggregating Sequential Views by RNN With Attention , 2019, IEEE Transactions on Image Processing.

[26]  Longin Jan Latecki,et al.  Automatic Ensemble Diffusion for 3D Shape and Image Retrieval , 2019, IEEE Transactions on Image Processing.

[27]  Geoffrey E. Hinton,et al.  Layer Normalization , 2016, ArXiv.

[28]  Jianxiong Xiao,et al.  3D ShapeNets: A deep representation for volumetric shapes , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Ryutarou Ohbuchi,et al.  Deep Aggregation of Local 3D Geometric Features for 3D Model Retrieval , 2016, BMVC.

[30]  Shanmuganathan Raman,et al.  LP-3DCNN: Unveiling Local Phase in 3D Convolutional Neural Networks , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Zijian Zhang,et al.  Query-Biased Self-Attentive Network for Query-Focused Video Summarization , 2020, IEEE Transactions on Image Processing.

[32]  Nick Barnes,et al.  Blended Convolution and Synthesis for Efficient Discrimination of 3D Shapes , 2019, 2020 IEEE Winter Conference on Applications of Computer Vision (WACV).

[33]  Jonathan Masci,et al.  Learning shape correspondence with anisotropic convolutional neural networks , 2016, NIPS.

[34]  Guoliang Fan,et al.  Supervised Deep-Autoencoder for Depth Image-Based 3D Model Retrieval , 2018, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).

[35]  Longin Jan Latecki,et al.  GIFT: A Real-Time and Scalable 3D Shape Search Engine , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[36]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[37]  Leonidas J. Guibas,et al.  Volumetric and Multi-view CNNs for Object Classification on 3D Data , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[38]  Kun Wang,et al.  PANORAMA-Based Multi-Scale and Multi-Channel CNN for 3D Model Retrieval , 2018, 2018 IEEE Visual Communications and Image Processing (VCIP).

[39]  Yue Wang,et al.  Dynamic Graph CNN for Learning on Point Clouds , 2018, ACM Trans. Graph..

[40]  Ming Ouhyoung,et al.  On Visual Similarity Based 3D Model Retrieval , 2003, Comput. Graph. Forum.

[41]  Yaron Lipman,et al.  Point convolutional neural networks by extension operators , 2018, ACM Trans. Graph..

[42]  Leonidas J. Guibas,et al.  FPNN: Field Probing Neural Networks for 3D Data , 2016, NIPS.

[43]  Andrew Zisserman,et al.  Fisher Vector Faces in the Wild , 2013, BMVC.

[44]  Daniel Cremers,et al.  Anisotropic Diffusion Descriptors , 2016, Comput. Graph. Forum.

[45]  Xiaogang Wang,et al.  SCAN: Self-and-Collaborative Attention Network for Video Person Re-Identification , 2018, IEEE Transactions on Image Processing.

[46]  Sainan Liu,et al.  Attentional ShapeContextNet for Point Cloud Recognition , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[47]  Ioannis Pratikakis,et al.  Efficient 3D shape matching and retrieval using a concrete radialized spherical projection representation , 2007, Pattern Recognit..

[48]  Didier Stricker,et al.  Learning 3D Shapes as Multi-Layered Height-maps using 2D Convolutional Networks , 2018, ECCV.

[49]  Karthik Ramani,et al.  Deep Learning 3D Shape Surfaces Using Geometry Images , 2016, ECCV.

[50]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[51]  Leonidas J. Guibas,et al.  PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[52]  Yue Gao,et al.  PVRNet: Point-View Relation Neural Network for 3D Shape Recognition , 2018, AAAI.

[53]  Ioannis Pratikakis,et al.  Exploiting the PANORAMA Representation for Convolutional Neural Network Classification and Retrieval , 2017, 3DOR@Eurographics.

[54]  Yasuyuki Matsushita,et al.  RotationNet: Joint Object Categorization and Pose Estimation Using Multiviews from Unsupervised Viewpoints , 2016, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[55]  Leonidas J. Guibas,et al.  PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space , 2017, NIPS.