Progressive conditional GAN-based augmentation for 3D object recognition

Abstract We address 3D object recognition from the perspective of scarce labelled data. We propose a novel progressive conditional generative adversarial network (PC-GAN) for 3D object recognition, which combines conditioning of the input with a progressive learning strategy. PC-GAN is an adversarial model whose generator automatically produces realistic, annotated 3D objects, and whose discriminator both distinguishes them from samples of the training distribution and recognizes their categories. A softmax classifier embedded in the discriminator predicts the class label, so the discriminative classifier is trained simultaneously with the generator. Progressive learning feeds input samples from lower to higher resolutions, so that generator performance improves gradually and the generator produces informative objects for a given class. The key idea of adopting progressive learning is to mitigate overshoot issues of the discriminator and to increase the variation in the generated objects. This strategy helps the generator produce more realistic synthetic objects and improves the active classification performance of the discriminator. The proposed PC-GAN is trained for object classification in a supervised manner, and its performance is evaluated on two public datasets. Experimental results show that PC-GAN outperforms existing volumetric discriminative classifiers in terms of classification accuracy.
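
The two mechanisms named in the abstract, a low-to-high resolution input schedule and a discriminator with an embedded softmax classifier, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the voxel representation, the resolutions, or the loss. The max-pooling downsampler, the `(8, 16, 32)` schedule, the AC-GAN-style sum of a real/fake term and a class term, and all function names below are assumptions chosen for illustration.

```python
import numpy as np


def downsample_voxels(grid, factor):
    """Max-pool a cubic occupancy grid by an integer factor (assumed scheme)."""
    n = grid.shape[0]
    if grid.shape != (n, n, n) or n % factor != 0:
        raise ValueError("grid must be cubic and divisible by factor")
    m = n // factor
    # Split each axis into (coarse index, offset inside the block).
    blocks = grid.reshape(m, factor, m, factor, m, factor)
    # A coarse cell is occupied if any fine voxel inside it is occupied.
    return blocks.max(axis=(1, 3, 5))


def progressive_pyramid(grid, resolutions=(8, 16, 32)):
    """Return the same object at increasing resolutions (hypothetical schedule)."""
    n = grid.shape[0]
    return [downsample_voxels(grid, n // r) for r in resolutions]


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def discriminator_loss(real_prob, class_logits, is_real, labels, eps=1e-12):
    """Assumed AC-GAN-style objective: real/fake loss plus class loss.

    real_prob:    (batch,) probability that each sample is real
    class_logits: (batch, n_classes) scores fed to the softmax classifier
    is_real:      (batch,) 1.0 for training samples, 0.0 for generated ones
    labels:       (batch,) integer class labels
    """
    source = -(is_real * np.log(real_prob + eps)
               + (1.0 - is_real) * np.log(1.0 - real_prob + eps))
    class_prob = softmax(class_logits)
    picked = class_prob[np.arange(len(labels)), labels]
    category = -np.log(picked + eps)
    return float(np.mean(source + category))


if __name__ == "__main__":
    voxels = np.zeros((32, 32, 32), dtype=np.float32)
    voxels[5, 17, 30] = 1.0  # a single occupied voxel as a toy object
    for level in progressive_pyramid(voxels):
        print(level.shape, np.argwhere(level == 1.0).tolist())
```

In a training loop following this sketch, the generator and discriminator would first see the coarsest level and move to the next one after a fixed number of steps; how and when the paper switches resolutions is not stated in the abstract.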
