论文信息 - Learning Relationships for Multi-View 3D Object Recognition

Learning Relationships for Multi-View 3D Object Recognition

Recognizing 3D object has attracted plenty of attention recently, and view-based methods have achieved best results until now. However, previous view-based methods ignore the region-to-region and view-to-view relationships between different view images, which are crucial for multi-view 3D object representation. To tackle this problem, we propose a Relation Network to effectively connect corresponding regions from different viewpoints, and therefore reinforce the information of individual view image. In addition, the Relation Network exploits the inter-relationships over a group of views, and integrates those views to obtain a discriminative 3D object representation. Systematic experiments conducted on ModelNet dataset demonstrate the effectiveness of our proposed methods for both 3D object recognition and retrieval tasks.

Liwei Wang | Ze Yang | Liwei Wang | Ze Yang

[1] Szymon Rusinkiewicz,et al. Rotation Invariant Spherical Harmonic Representation of 3D Shape Descriptors , 2003, Symposium on Geometry Processing.

[2] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[3] Bernard Chazelle,et al. Shape distributions , 2002, TOGS.

[4] Jianxiong Xiao,et al. 3D ShapeNets: A deep representation for volumetric shapes , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5] Victor S. Lempitsky,et al. Escape from Cells: Deep Kd-Networks for the Recognition of 3D Point Cloud Models , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[6] Deva Ramanan,et al. Attentional Pooling for Action Recognition , 2017, NIPS.

[7] Junsong Yuan,et al. Multi-view Harmonized Bilinear Network for 3D Object Recognition , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[8] Razvan Pascanu,et al. Interaction Networks for Learning about Objects, Relations and Physics , 2016, NIPS.

[9] Hiroshi Murase,et al. Visual learning and recognition of 3-d objects from appearance , 2005, International Journal of Computer Vision.

[10] Luc Van Gool,et al. Hough Transform and 3D SURF for Robust Three Dimensional Classification , 2010, ECCV.

[11] Berthold K. P. Horn. Extended Gaussian images , 1984, Proceedings of the IEEE.

[12] Iasonas Kokkinos,et al. Intrinsic shape context descriptors for deformable shapes , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[13] Leonidas J. Guibas,et al. Shape google: Geometric words and expressions for invariant shape retrieval , 2011, TOGS.

[14] Theodore Lim,et al. Generative and Discriminative Voxel Modeling with Convolutional Neural Networks , 2016, ArXiv.

[15] Sebastian Scherer,et al. VoxNet: A 3D Convolutional Neural Network for real-time object recognition , 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[16] Kaleem Siddiqi,et al. Dominant Set Clustering and Pooling for Multi-View 3D Object Recognition , 2019, BMVC.

[17] Yue Gao,et al. GVCNN: Group-View Convolutional Neural Networks for 3D Shape Recognition , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[18] Yann Dauphin,et al. Convolutional Sequence to Sequence Learning , 2017, ICML.

[19] Razvan Pascanu,et al. Visual Interaction Networks: Learning a Physics Simulator from Video , 2017, NIPS.

[20] Leonidas J. Guibas,et al. FPNN: Field Probing Neural Networks for 3D Data , 2016, NIPS.

[21] Yasuyuki Matsushita,et al. RotationNet: Joint Object Categorization and Pose Estimation Using Multiviews from Unsupervised Viewpoints , 2016, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[22] Razvan Pascanu,et al. A simple neural network module for relational reasoning , 2017, NIPS.

[23] Stefan Leutenegger,et al. Pairwise Decomposition of Image Sequences for Active Multi-view Recognition , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[24] Leonidas J. Guibas,et al. Volumetric and Multi-view CNNs for Object Classification on 3D Data , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[25] Andrew Zisserman,et al. Return of the Devil in the Details: Delving Deep into Convolutional Nets , 2014, BMVC.

[26] Samy Bengio,et al. Show and tell: A neural image caption generator , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[28] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29] Thomas Brox,et al. Orientation-boosted Voxel Nets for 3D Object Recognition , 2016, BMVC.

[30] Ming Ouhyoung,et al. On Visual Similarity Based 3D Model Retrieval , 2003, Comput. Graph. Forum.

[31] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[32] Subhransu Maji,et al. Multi-view Convolutional Neural Networks for 3D Shape Recognition , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[33] Abhinav Gupta,et al. Non-local Neural Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[34] Yichen Wei,et al. Relation Networks for Object Detection , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[35] Andrew Zisserman,et al. Fisher Vector Faces in the Wild , 2013, BMVC.

[36] Leonidas J. Guibas,et al. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[37] Leonidas J. Guibas,et al. ShapeNet: An Information-Rich 3D Model Repository , 2015, ArXiv.

[38] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.

[39] Leonidas J. Guibas,et al. PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space , 2017, NIPS.

[40] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[41] Subhransu Maji,et al. SPLATNet: Sparse Lattice Networks for Point Cloud Processing , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[42] Kilian Q. Weinberger,et al. Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[43] Longin Jan Latecki,et al. GIFT: A Real-Time and Scalable 3D Shape Search Engine , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[44] Asim Kadav,et al. Attend and Interact: Higher-Order Object Interactions for Video Understanding , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[45] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.