Deep Auxiliary Learning for Point Cloud Generation

Generating a point cloud from a single image is a classical problem in computer vision. Learning-based methods for this task often adopt local distance metrics as the loss function, so the generated points may not match the overall shape distribution of the target object. To address this problem, we introduce a voxel reconstruction network with distribution fitting as an auxiliary task and propose a novel framework named Voxel-Assisted Points Generation Network (VAPGN). Auxiliary learning with voxel generation makes it easier for the encoder to capture the shape distribution of the objects in the image, thereby improving the result of point cloud reconstruction. To meet the needs of mobile and embedded applications, we also propose a mobile version of the model. In the experiments, we verify the feasibility of our network on the ShapeNet dataset. The proposed framework achieves outstanding performance on the point cloud generation task compared with various state-of-the-art methods.
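As a rough illustration of how a point-wise loss and an auxiliary voxel loss could be combined, the following is a minimal sketch and not the authors' implementation. The abstract names neither the local distance metric nor the voxel loss, so the Chamfer distance, the binary cross-entropy term, and the weight `aux_weight` are all assumptions, as are the function names.

```python
import numpy as np

def chamfer_distance(pred, target):
    """Symmetric Chamfer distance between point sets of shape (N, 3) and (M, 3).

    Assumed stand-in for the "local distance metric"; the source does not name one.
    """
    diff = pred[:, None, :] - target[None, :, :]   # (N, M, 3) pairwise offsets
    sq_dist = np.sum(diff ** 2, axis=-1)           # (N, M) squared distances
    # Mean nearest-neighbour distance in both directions.
    return float(sq_dist.min(axis=1).mean() + sq_dist.min(axis=0).mean())

def voxel_bce(pred_occupancy, target_occupancy, eps=1e-7):
    """Binary cross-entropy over a voxel grid (assumed auxiliary loss)."""
    p = np.clip(pred_occupancy, eps, 1.0 - eps)    # avoid log(0)
    t = target_occupancy
    return float(-np.mean(t * np.log(p) + (1.0 - t) * np.log(1.0 - p)))

def joint_loss(pred_points, target_points, pred_vox, target_vox, aux_weight=0.5):
    """Main point loss plus a weighted auxiliary voxel loss.

    aux_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    main = chamfer_distance(pred_points, target_points)
    aux = voxel_bce(pred_vox, target_vox)
    return main + aux_weight * aux

# Toy usage: one predicted point, one target point, and a 2x2x2 voxel grid.
pred_points = np.array([[0.0, 0.0, 0.0]])
target_points = np.array([[1.0, 0.0, 0.0]])
pred_vox = np.full((2, 2, 2), 0.5)                 # maximally uncertain occupancy
target_vox = np.zeros((2, 2, 2))
target_vox[0, 0, 0] = 1.0
loss = joint_loss(pred_points, target_points, pred_vox, target_vox)
```

In this sketch the Chamfer term only compares each point with its nearest neighbour, while the voxel term penalizes errors over the whole occupancy grid. When both branches share an encoder, the second term is one way an auxiliary task could push the encoder toward the overall shape.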
