Cross Diffusion on Multi-hypergraph for Multi-modal 3D Object Recognition

3D object recognition is a longstanding task in computer vision with wide applications in computer-aided design, virtual reality, etc. Current state-of-the-art methods mainly focus on learning 3D object representations for recognition. Since 3D objects are often described by multiple modalities in practice, how to effectively combine such multi-modal information for recognition remains a challenging and pressing problem. In this paper, we conduct 3D object recognition with multi-modal information through a cross diffusion process on a multi-hypergraph structure. Given multi-modal representations of 3D objects, the correlation among the objects is formulated as a multi-hypergraph structure, with one hypergraph constructed per representation, which is able to model complex relationships among objects. To combine the multi-modal representations, we propose a cross diffusion process on the multi-hypergraph, in which label information is propagated alternately across the hypergraphs. In this way, multi-modal information is jointly combined through the cross diffusion process on the multi-hypergraph structure. We apply the proposed method to 3D object recognition using multiple representations and evaluate it through extensive experiments on two public 3D object datasets. Experimental results demonstrate that the proposed method achieves satisfactory multi-modal combination performance and outperforms current state-of-the-art methods.
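To make the idea concrete, the following is a minimal sketch of cross diffusion on a multi-hypergraph. It assumes the standard hypergraph transition operator Theta = Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2} and an alternating update in which each modality diffuses the averaged label estimate of the other modalities; the function names, the update rule, and the parameters alpha and iters are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def hypergraph_operator(H, w):
    """Transition operator for one modality's hypergraph.
    H: (n, m) vertex-hyperedge incidence matrix; w: (m,) hyperedge weights.
    Returns Theta = Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2} (standard form;
    assumed here, as the abstract does not specify the operator)."""
    Dv = (H * w).sum(axis=1)                  # weighted vertex degrees
    De = H.sum(axis=0)                        # hyperedge degrees
    inv_sqrt_Dv = np.diag(1.0 / np.sqrt(Dv))
    return inv_sqrt_Dv @ H @ np.diag(w) @ np.diag(1.0 / De) @ H.T @ inv_sqrt_Dv

def cross_diffusion(operators, Y, alpha=0.9, iters=30):
    """operators: one transition matrix per modality.
    Y: (n, c) one-hot labels, zero rows for unlabeled objects.
    Each modality diffuses the mean label estimate of the *other*
    modalities -- the 'cross' step -- then anchors back to Y."""
    F = [Y.copy() for _ in operators]
    for _ in range(iters):
        F_new = []
        for m, Theta in enumerate(operators):
            others = np.mean([F[k] for k in range(len(F)) if k != m], axis=0)
            F_new.append(alpha * Theta @ others + (1 - alpha) * Y)
        F = F_new
    return np.mean(F, axis=0)                 # fused per-object label scores

# Usage: predicted class of object i is np.argmax(scores[i]) on the fused scores.
```

Predicting with the argmax of the fused scores mirrors standard hypergraph label propagation; the cross step is what lets evidence from one modality's hyperedges reshape the label estimates used by the others.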
