View-based weight network for 3D object recognition

Abstract: In recent years, projective methods have generally achieved better results in 3D object recognition. This may mirror human vision, in which the perception of 3D shapes relies on multiple 2D observations formed unconsciously on the retina. Existing methods treat every projection equally. However, we note that images of the same object taken from different viewpoints carry different discriminative features, and only some of the images are highly significant. We propose a novel View-based Weight Network (VWN) for 3D object recognition, in which different view-based weights are assigned to different projections. The trainable view-level weights are incorporated as a pooling layer of a multi-view residual network; this pooling layer contains 7 sub-layers. We also find a simple unsupervised criterion for evaluating prediction results before they are output. Based on this criterion, and to improve recognition accuracy, we propose a new multi-channel integrated classifier combining Extreme Learning Machine, KNN, SVM, and Random Forest. The multi-channel classifier brings Top-1 accuracy close to Top-2 accuracy. Experiments on the Princeton ModelNet 3D datasets demonstrate that the proposed method significantly outperforms state-of-the-art approaches in recognition accuracy.
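The idea of assigning a trainable weight to each projection can be illustrated with a minimal sketch. This is not the paper's 7-sub-layer pooling layer, whose internals the abstract does not describe; it only assumes that each view has one scalar score, that the scores are normalized with a softmax, and that the pooled descriptor is the weighted sum of per-view feature vectors. The names `weighted_view_pool`, `view_features`, and `view_scores` are hypothetical.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - np.max(x)
    exp = np.exp(shifted)
    return exp / exp.sum()


def weighted_view_pool(view_features, view_scores):
    """Pool per-view features into one descriptor (illustrative assumption).

    view_features: array of shape (V, D), one D-dimensional feature per view.
    view_scores:   array of shape (V,), one score per view; in a real network
                   these would be trainable parameters.
    Returns the pooled (D,) descriptor and the normalized (V,) weights.
    """
    weights = softmax(view_scores)
    pooled = weights @ view_features  # weighted sum over the V views
    return pooled, weights


# Example: 12 views with 8-dimensional features (sizes chosen arbitrarily).
rng = np.random.default_rng(0)
features = rng.normal(size=(12, 8))

# Equal scores reduce the layer to plain average pooling over views.
uniform_pooled, uniform_weights = weighted_view_pool(features, np.zeros(12))

# A dominant score makes the pooled descriptor follow that single view.
scores = np.zeros(12)
scores[3] = 50.0
peaked_pooled, peaked_weights = weighted_view_pool(features, scores)
```

With equal scores the sketch treats every projection equally, as the abstract says existing methods do; unequal scores let the more discriminative views dominate the pooled descriptor.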
