One Point is All You Need: Directional Attention Point for Feature Learning

We present a novel attention-based mechanism for learning enhanced point features for tasks such as point cloud classification and segmentation. Our key message is that if the right attention point is selected, then "one point is all you need" -- not a sequence as in a recurrent model and not a pre-selected set as in all prior works. Also, where the attention point is should be learned, from data and specific to the task at hand. Our mechanism is characterized by a new and simple convolution, which combines the feature at an input point with the feature at its associated attention point. We call such a point a directional attention point (DAP), since it is found by adding to the original point an offset vector that is learned by maximizing the task performance in training. We show that our attention mechanism can be easily incorporated into state-of-the-art point cloud classification and segmentation networks. Extensive experiments on common benchmarks such as ModelNet40, ShapeNetPart, and S3DIS demonstrate that our DAP-enabled networks consistently outperform the respective original networks, as well as all other competitive alternatives, including those employing pre-selected sets of attention points.

[1]  Hongbo Fu,et al.  JSENet: Joint Semantic Segmentation and Edge Detection Network for 3D Point Clouds , 2020, ECCV.

[2]  Leonidas J. Guibas,et al.  A scalable active framework for region annotation in 3D shape collections , 2016, ACM Trans. Graph..

[3]  Chunxia Xiao,et al.  PCAN: 3D Attention Map Learning Using Contextual Information for Point Cloud Based Retrieval , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Shiming Xiang,et al.  Relation-Shape Convolutional Neural Network for Point Cloud Analysis , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Leonidas J. Guibas,et al.  KPConv: Flexible and Deformable Convolution for Point Clouds , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[6]  Antonios Tsourdos,et al.  GAPNet: Graph Attention based Point Neural Network for Exploiting Local Feature of Point Cloud , 2019, ArXiv.

[7]  Yue Wang,et al.  Dynamic Graph CNN for Learning on Point Clouds , 2018, ACM Trans. Graph..

[8]  J. Theeuwes,et al.  Top-down versus bottom-up attentional control: a failed theoretical dichotomy , 2012, Trends in Cognitive Sciences.

[9]  Silvio Savarese,et al.  3D Semantic Parsing of Large-Scale Indoor Spaces , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Zheng Zhang,et al.  A Closer Look at Local Aggregation Operators in Point Cloud Analysis , 2020, ECCV.

[11]  Xiao Liu,et al.  Fully Convolutional Attention Networks for Fine-Grained Recognition , 2016 .

[12]  Shuguang Cui,et al.  PointASNL: Robust Point Clouds Processing Using Nonlocal Neural Networks With Adaptive Sampling , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Silvio Savarese,et al.  4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Bingbing Ni,et al.  Modeling Point Clouds With Self-Attention and Gumbel Subset Sampling , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Li Liu,et al.  Deep Learning for 3D Point Clouds: A Survey , 2020, IEEE transactions on pattern analysis and machine intelligence.

[16]  Ming Yang,et al.  3D Graph Embedding Learning with a Structure-aware Loss Function for Point Cloud Semantic Instance Segmentation , 2019, ArXiv.

[17]  Jing Hua,et al.  A-CNN: Annularly Convolutional Neural Networks on Point Clouds , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  Alexandre Boulch ConvPoint: Continuous convolutions for point cloud processing , 2020, Comput. Graph..

[19]  Vladlen Koltun,et al.  Tangent Convolutions for Dense Prediction in 3D , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[20]  Yaron Lipman,et al.  Point convolutional neural networks by extension operators , 2018, ACM Trans. Graph..

[21]  Ronald A. Rensink The Dynamic Representation of Scenes , 2000 .

[22]  Jiwen Lu,et al.  DensePoint: Learning Densely Contextual Representation for Efficient Point Cloud Processing , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[23]  Sainan Liu,et al.  Attentional ShapeContextNet for Point Cloud Recognition , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[24]  Raquel Urtasun,et al.  Deep Parametric Continuous Convolutional Neural Networks , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[25]  Joachim Weickert,et al.  Anisotropic diffusion in image processing , 1996 .

[26]  Gernot Riegler,et al.  OctNet: Learning Deep 3D Representations at High Resolutions , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Qing Li,et al.  Point2Node: Correlation Learning of Dynamic-Node for Point Cloud Feature Modeling , 2019, AAAI.

[28]  Yang Liu,et al.  O-CNN , 2017, ACM Trans. Graph..

[29]  Le Hui,et al.  Cascaded Non-local Neural Network for Point Cloud Semantic Segmentation , 2020, 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[30]  Yanbo Ma,et al.  Scanet: Spatial-channel Attention Network for 3D Object Detection , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[31]  C. Qi Deep Learning on Point Sets for 3 D Classification and Segmentation , 2016 .

[32]  Dan Xu,et al.  Dynamic Graph Message Passing Networks , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Matthias Zwicker,et al.  Point2Sequence: Learning the Shape Representation of 3D Point Clouds with an Attention-based Sequence to Sequence Network , 2018, AAAI.

[34]  A. Baudes,et al.  A Nonlocal Algorithm for Image Denoising , 2005, CVPR 2005.

[35]  Chi-Wing Fu,et al.  Hierarchical Point-Edge Interaction Network for Point Cloud Semantic Segmentation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[36]  Alex Graves,et al.  Recurrent Models of Visual Attention , 2014, NIPS.

[37]  Hao Su,et al.  Multi-View PointNet for 3D Scene Understanding , 2019, 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW).

[38]  Xiaopeng Li,et al.  Not All Attention Is Needed: Gated Attention Network for Sequence Data , 2019, AAAI.

[39]  Wei Wu,et al.  PointCNN: convolution on Χ -transformed points , 2018, NIPS 2018.

[40]  Jiaxin Li,et al.  SO-Net: Self-Organizing Network for Point Cloud Analysis , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[41]  Xiaogang Wang,et al.  Residual Attention Network for Image Classification , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[42]  Fuxin Li,et al.  PointConv: Deep Convolutional Networks on 3D Point Clouds , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[43]  Yuxin Peng,et al.  Object-Part Attention Model for Fine-Grained Image Classification , 2017, IEEE Transactions on Image Processing.

[44]  Leonidas J. Guibas,et al.  PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[45]  Leonidas J. Guibas,et al.  ShapeNet: An Information-Rich 3D Model Repository , 2015, ArXiv.

[46]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[47]  Leonidas J. Guibas,et al.  PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space , 2017, NIPS.

[48]  Jonathan Sauder,et al.  Self-Supervised Deep Learning on Point Clouds by Reconstructing Space , 2019, NeurIPS.

[49]  Chi-Wing Fu,et al.  PointWeb: Enhancing Local Neighborhood Features for Point Cloud Processing , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[50]  Xianzhi Li,et al.  PointAugment: An Auto-Augmentation Framework for Point Cloud Classification , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[51]  Jianxiong Xiao,et al.  3D ShapeNets: A deep representation for volumetric shapes , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[52]  Victor S. Lempitsky,et al.  Escape from Cells: Deep Kd-Networks for the Recognition of 3D Point Cloud Models , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[53]  Jure Leskovec,et al.  Inductive Representation Learning on Large Graphs , 2017, NIPS.

[54]  Martin Simonovsky,et al.  Large-Scale Point Cloud Semantic Segmentation with Superpoint Graphs , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[55]  Yifan Xu,et al.  SpiderCNN: Deep Learning on Point Sets with Parameterized Convolutional Filters , 2018, ECCV.