Revisiting Point Cloud Shape Classification with a Simple and Effective Baseline

Processing point cloud data is an important component of many real-world systems. As such, a wide variety of point-based approaches have been proposed, reporting steady benchmark improvements over time. We study the key ingredients of this progress and uncover two critical results. First, we find that auxiliary factors like different evaluation schemes, data augmentation strategies, and loss functions, which are independent of the model architecture, make a large difference in performance. The differences are large enough that they obscure the effect of architecture. When these factors are controlled for, PointNet++, a relatively older network, performs competitively with recent methods. Second, a very simple projection-based method, which we refer to as SimpleView, performs surprisingly well. It achieves on par or better results than sophisticated state-of-the-art methods on ModelNet40 while being half the size of PointNet++. It also outperforms state-of-the-art methods on ScanObjectNN, a real-world point cloud benchmark, and demonstrates better cross-dataset generalization. Code is available at https://github.com/ princeton-vl/SimpleView.

[1]  Didier Stricker,et al.  Learning 3D Shapes as Multi-Layered Height-maps using 2D Convolutional Networks , 2018, ECCV.

[2]  Ming Hao,et al.  Linked Dynamic Graph CNN: Learning on Point Cloud via Linking Hierarchical Features , 2019, ArXiv.

[3]  Leonidas J. Guibas,et al.  Volumetric and Multi-view CNNs for Object Classification on 3D Data , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Yifan Xu,et al.  SpiderCNN: Deep Learning on Point Sets with Parameterized Convolutional Filters , 2018, ECCV.

[5]  Jing Hua,et al.  A-CNN: Annularly Convolutional Neural Networks on Point Clouds , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Yue Gao,et al.  GVCNN: Group-View Convolutional Neural Networks for 3D Shape Recognition , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[7]  Sergey Ioffe,et al.  Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Zoltan-Csaba Marton,et al.  Tutorial: Point Cloud Library: Three-Dimensional Object Recognition and 6 DOF Pose Estimation , 2012, IEEE Robotics & Automation Magazine.

[9]  Nico Blodow,et al.  Close-range scene segmentation and reconstruction of 3D point cloud maps for mobile manipulation in domestic environments , 2009, 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[10]  Subhransu Maji,et al.  3D Shape Segmentation with Projective Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Oliver Brock,et al.  Analysis and Observations From the First Amazon Picking Challenge , 2016, IEEE Transactions on Automation Science and Engineering.

[12]  Yue Gao,et al.  Multi-View 3D Object Retrieval With Deep Embedding Network , 2016, IEEE Transactions on Image Processing.

[13]  Christoph Mertz,et al.  Pedestrian Detection and Tracking Using Three-dimensional LADAR Data , 2010, Int. J. Robotics Res..

[14]  Sainan Liu,et al.  Attentional ShapeContextNet for Point Cloud Recognition , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[15]  Subhransu Maji,et al.  SPLATNet: Sparse Lattice Networks for Point Cloud Processing , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[16]  Georg Heigold,et al.  An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale , 2021, ICLR.

[17]  Svetlana Lazebnik,et al.  Superparsing , 2010, International Journal of Computer Vision.

[18]  C. Qi Deep Learning on Point Sets for 3 D Classification and Segmentation , 2016 .

[19]  Yue Wang,et al.  Dynamic Graph CNN for Learning on Point Clouds , 2018, ACM Trans. Graph..

[20]  Fuxin Li,et al.  PointConv: Deep Convolutional Networks on 3D Point Clouds , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Yue Gao,et al.  Inductive Multi-Hypergraph Learning and Its Application on View-Based 3D Object Classification , 2018, IEEE Transactions on Image Processing.

[22]  Matthias Zwicker,et al.  Point2Sequence: Learning the Shape Representation of 3D Point Clouds with an Attention-based Sequence to Sequence Network , 2018, AAAI.

[23]  Jiwen Lu,et al.  DensePoint: Learning Densely Contextual Representation for Efficient Point Cloud Processing , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[24]  Leonidas J. Guibas,et al.  PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Mohammed Bennamoun,et al.  Deep Learning for 3D Point Clouds: A Survey , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[26]  Shiming Xiang,et al.  Relation-Shape Convolutional Neural Network for Point Cloud Analysis , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Yaron Lipman,et al.  Point convolutional neural networks by extension operators , 2018, ACM Trans. Graph..

[28]  Sebastian Scherer,et al.  VoxNet: A 3D Convolutional Neural Network for real-time object recognition , 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[29]  Jun Miura,et al.  Pedestrian Recognition Using High-definition LIDAR , 2011 .

[30]  Matthias Nießner,et al.  3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation , 2018, ECCV.

[31]  Alexander J. Smola,et al.  Deep Sets , 2017, 1703.06114.

[32]  Michael Felsberg,et al.  Deep Projective 3D Semantic Segmentation , 2017, CAIP.

[33]  Xiangyun Hu,et al.  Deep fusion of multi-view and multimodal representation of ALS point cloud for 3D terrain scene recognition , 2018, ISPRS Journal of Photogrammetry and Remote Sensing.

[34]  Dieter Fox,et al.  6-DOF GraspNet: Variational Grasp Generation for Object Manipulation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[35]  Yue Gao,et al.  Hypergraph Neural Networks , 2018, AAAI.

[36]  Nikos Komodakis,et al.  Dynamic Edge-Conditioned Filters in Convolutional Neural Networks on Graphs , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[37]  Jianxiong Xiao,et al.  Sliding Shapes for 3D Object Detection in Depth Images , 2014, ECCV.

[38]  Jianxiong Xiao,et al.  3D ShapeNets: A deep representation for volumetric shapes , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[39]  Vladlen Koltun,et al.  Tangent Convolutions for Dense Prediction in 3D , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[40]  Jianxiong Xiao,et al.  Deep Sliding Shapes for Amodal 3D Object Detection in RGB-D Images , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[41]  Subhransu Maji,et al.  Multi-view Convolutional Neural Networks for 3D Shape Recognition , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[42]  Shuguang Cui,et al.  PointASNL: Robust Point Clouds Processing Using Nonlocal Neural Networks With Adaptive Sampling , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[43]  Jiaxin Li,et al.  SO-Net: Self-Organizing Network for Point Cloud Analysis , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[44]  Wolfram Burgard,et al.  Using Boosted Features for the Detection of People in 2D Range Data , 2007, Proceedings 2007 IEEE International Conference on Robotics and Automation.

[45]  Jing Huang,et al.  Point cloud labeling using 3D Convolutional Neural Network , 2016, 2016 23rd International Conference on Pattern Recognition (ICPR).

[46]  Sing Bing Kang,et al.  Revealing Scenes by Inverting Structure From Motion Reconstructions , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[47]  Natalia Gimelshein,et al.  PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.

[48]  Wei Wu,et al.  PointCNN: Convolution On X-Transformed Points , 2018, NeurIPS.

[49]  Kaleem Siddiqi,et al.  Local Spectral Graph Convolution for Point Set Feature Learning , 2018, ECCV.

[50]  Patrick Wieschollek,et al.  Flex-Convolution - Million-Scale Point-Cloud Learning Beyond Grid-Worlds , 2018, ACCV.

[51]  Marc Pollefeys,et al.  Semantic3D.net: A new Large-scale Point Cloud Classification Benchmark , 2017, ArXiv.

[52]  Junsong Yuan,et al.  Multi-view Harmonized Bilinear Network for 3D Object Recognition , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[53]  Kan Chen,et al.  Billion-scale semi-supervised learning for image classification , 2019, ArXiv.

[54]  Silvio Savarese,et al.  SEGCloud: Semantic Segmentation of 3D Point Clouds , 2017, 2017 International Conference on 3D Vision (3DV).

[55]  Yasuyuki Matsushita,et al.  RotationNet: Joint Object Categorization and Pose Estimation Using Multiviews from Unsupervised Viewpoints , 2016, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[56]  Leonidas J. Guibas,et al.  PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space , 2017, NIPS.

[57]  Leonidas J. Guibas,et al.  KPConv: Flexible and Deformable Convolution for Point Clouds , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[58]  Philip H. S. Torr,et al.  What, Where and How Many? Combining Object Detectors and CRFs , 2010, ECCV.

[59]  Victor S. Lempitsky,et al.  Escape from Cells: Deep Kd-Networks for the Recognition of 3D Point Cloud Models , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[60]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[61]  Duc Thanh Nguyen,et al.  Revisiting Point Cloud Classification: A New Benchmark Dataset and Classification Model on Real-World Data , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[62]  Zhichao Zhou,et al.  DeepPano: Deep Panoramic Representation for 3-D Shape Recognition , 2015, IEEE Signal Processing Letters.

[63]  Markus H. Gross,et al.  A Network Architecture for Point Cloud Classification via Automatic Depth Images Generation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[64]  Anath Fischer,et al.  3DmFV: Three-Dimensional Point Cloud Classification in Real-Time Using Convolutional Neural Networks , 2018, IEEE Robotics and Automation Letters.

[65]  Alexandre Boulch,et al.  Unstructured Point Cloud Semantic Labeling Using Deep Segmentation Networks , 2017, 3DOR@Eurographics.