Unsupervised Multi-Task Feature Learning on Point Clouds

We introduce an unsupervised multi-task model that jointly learns point and shape features on point clouds. We define three unsupervised tasks (clustering, reconstruction, and self-supervised classification) to train a multi-scale graph-based encoder. We evaluate the model on shape classification and segmentation benchmarks. The results suggest that it outperforms prior state-of-the-art unsupervised models: on the ModelNet40 classification task it achieves an accuracy of 89.1%, and on the ShapeNet segmentation task it achieves an mIoU of 68.2 and an accuracy of 88.6%.
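
To make the three-task objective concrete, the following is a minimal NumPy sketch of how clustering, reconstruction, and self-supervised classification losses might be combined into one training signal. The abstract does not specify the loss functions, so everything here is an assumption made for illustration: a k-means-style distance for clustering, the Chamfer distance for reconstruction, cross-entropy against cluster assignments used as pseudo-labels for classification, and equal task weights. All function and parameter names are hypothetical.

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3)."""
    # Pairwise squared Euclidean distances, shape (N, M).
    d = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def clustering_loss(z, centroids):
    """Mean squared distance of each embedding to its nearest centroid.

    Returns the loss and the hard cluster assignment of every embedding.
    """
    d = np.sum((z[:, None, :] - centroids[None, :, :]) ** 2, axis=-1)
    return d.min(axis=1).mean(), d.argmin(axis=1)

def cross_entropy(logits, labels):
    """Mean cross-entropy of integer labels under softmax(logits)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def multi_task_loss(points, recon, z, centroids, logits, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three unsupervised losses (assumed formulation).

    points, recon: lists of (N, 3) arrays (input shapes and reconstructions)
    z:             (B, D) shape embeddings produced by the encoder
    centroids:     (K, D) cluster centres in embedding space
    logits:        (B, K) classifier outputs over the K clusters
    """
    l_cluster, pseudo_labels = clustering_loss(z, centroids)
    l_recon = np.mean([chamfer_distance(p, r) for p, r in zip(points, recon)])
    # Assumption: cluster assignments serve as pseudo-labels for the classifier.
    l_cls = cross_entropy(logits, pseudo_labels)
    w_cluster, w_recon, w_cls = weights
    return w_cluster * l_cluster + w_recon * l_recon + w_cls * l_cls
```

In a real training loop the same quantities would be computed in an automatic-differentiation framework so that gradients from all three terms reach the shared encoder; the weights would be tuned rather than fixed at one.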
