DAR-Net: Dynamic Aggregation Network for Semantic Scene Segmentation

Traditional grid/neighbor-based static pooling has become a constraint for point cloud geometry analysis. In this paper, we propose DAR-Net, a novel network architecture that focuses on dynamic feature aggregation. The central idea of DAR-Net is generating a self-adaptive pooling skeleton that considers both scene complexity and local geometry features. Providing variable semi-local receptive fields and weights, the skeleton serves as a bridge that connect local convolutional feature extractors and a global recurrent feature integrator. Experimental results on indoor scene datasets show advantages of the proposed approach compared to state-of-the-art architectures that adopt static pooling methods.

[1]  Björn E. Ottersten,et al.  RGB-D Multi-view System Calibration for Full 3D Scene Reconstruction , 2014, 2014 22nd International Conference on Pattern Recognition.

[2]  Christophe Bobda,et al.  R-Covnet: Recurrent Neural Convolution Network for 3D Object Recognition , 2018, 2018 25th IEEE International Conference on Image Processing (ICIP).

[3]  Subhransu Maji,et al.  3D Shape Reconstruction from Sketches via Multi-view Convolutional Networks , 2017, 2017 International Conference on 3D Vision (3DV).

[4]  Leonidas J. Guibas,et al.  TextureNet: Consistent Local Parametrizations for Learning From High-Resolution Signals on Meshes , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Matthias Nießner,et al.  3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation , 2018, ECCV.

[6]  Daniel Cohen-Or,et al.  PointWise: An Unsupervised Point-wise Feature Learning Network , 2019, ArXiv.

[7]  Silvio Savarese,et al.  3D Semantic Parsing of Large-Scale Indoor Spaces , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Ulrich Neumann,et al.  Recurrent Slice Networks for 3D Segmentation of Point Clouds , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[9]  Gernot Riegler,et al.  OctNet: Learning Deep 3D Representations at High Resolutions , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Jonathan Masci,et al.  Learning shape correspondence with anisotropic convolutional neural networks , 2016, NIPS.

[11]  Leonidas J. Guibas,et al.  PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Jiwen Lu,et al.  3DCNN-DQN-RNN: A Deep Reinforcement Learning Framework for Semantic Parsing of Large-Scale 3D Point Clouds , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[13]  Leonidas J. Guibas,et al.  PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space , 2017, NIPS.

[14]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[15]  Wei Wu,et al.  PointCNN: Convolution On X-Transformed Points , 2018, NeurIPS.

[16]  Sebastian Scherer,et al.  VoxNet: A 3D Convolutional Neural Network for real-time object recognition , 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[17]  Dong Tian,et al.  FoldingNet: Point Cloud Auto-Encoder via Deep Grid Deformation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[18]  Jiamao Li,et al.  3D Recurrent Neural Networks with Context Fusion for Point Cloud Semantic Segmentation , 2018, ECCV.

[19]  Ingmar Posner,et al.  Voting for Voting in Online Point Cloud Object Detection , 2015, Robotics: Science and Systems.

[20]  Vladlen Koltun,et al.  Tangent Convolutions for Dense Prediction in 3D , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[21]  Victor S. Lempitsky,et al.  Escape from Cells: Deep Kd-Networks for the Recognition of 3D Point Cloud Models , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[22]  Jing Huang,et al.  Point cloud labeling using 3D Convolutional Neural Network , 2016, 2016 23rd International Conference on Pattern Recognition (ICPR).

[23]  Yves Lechevallier,et al.  Clustering Large, Multi-level Data Sets: An Apporach Based on Kohonen Self Organizing Maps , 2000, PKDD.

[24]  Teuvo Kohonen,et al.  Self-organized formation of topologically correct feature maps , 2004, Biological Cybernetics.

[25]  Bernard Ghanem,et al.  MortonNet: Self-Supervised Learning of Local Features in 3D Point Clouds , 2019, ArXiv.

[26]  Daniel Cremers,et al.  FuseNet: Incorporating Depth into Semantic Segmentation via Fusion-Based CNN Architecture , 2016, ACCV.

[27]  Yoshua Bengio,et al.  Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling , 2014, ArXiv.

[28]  Jiaxin Li,et al.  SO-Net: Self-Organizing Network for Point Cloud Analysis , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[29]  Matthias Nießner,et al.  ScanNet: Richly-Annotated 3D Reconstructions of Indoor Scenes , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Silvio Savarese,et al.  SEGCloud: Semantic Segmentation of 3D Point Clouds , 2017, 2017 International Conference on 3D Vision (3DV).

[31]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.