POIRot: A rotation invariant omni-directional pointnet

Point clouds are an efficient way to represent the 3D world. Point-cloud analysis aims to understand the underlying 3D geometric structure. However, because a point cloud lacks a smooth topology, and hence a neighborhood structure, standard correlation cannot be applied to it directly. One popular approach is to partition the point cloud into voxels and extract features using standard 3D correlation. This approach suffers from the sparsity of point clouds, which leaves many voxels empty. One possible solution is to learn an MLP that maps a point or its local neighborhood to a high-dimensional feature space. All of these methods require a large number of parameters and are susceptible to random rotations. A popular way to make a model "invariant" to rotations is data augmentation with small rotations, but this has two potential drawbacks: (i) it requires more training samples, and (ii) the model remains susceptible to large rotations. In this work, we develop a rotation-invariant point-cloud segmentation and classification scheme based on the omni-directional camera model (dubbed POIRot). Our proposed model is rotationally invariant and can preserve the geometric shape of a 3D point cloud. Because of this inherent rotation invariance, our framework requires fewer parameters (see [17] and the references therein for the motivation behind lean models). Several experiments show that our proposed method can outperform state-of-the-art algorithms in classification and part segmentation.
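The voxel-sparsity issue mentioned in the abstract can be illustrated with a small sketch. This is not the POIRot method or any code from the paper; it is a toy example under stated assumptions: the "scan" is a set of points sampled on a unit sphere, the grid covers the cube [-1, 1]^3, and the helper names (`sample_sphere_surface`, `occupied_voxels`, `empty_fraction`) are hypothetical. It only shows that a surface-like point cloud occupies a small share of a dense voxel grid.

```python
import math
import random


def sample_sphere_surface(n, seed=0):
    """Sample n points roughly uniformly on the unit sphere (a toy surface scan)."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        # Normalizing a Gaussian vector gives a uniform direction on the sphere.
        x, y, z = (rng.gauss(0.0, 1.0) for _ in range(3))
        norm = math.sqrt(x * x + y * y + z * z)
        points.append((x / norm, y / norm, z / norm))
    return points


def occupied_voxels(points, resolution, lo=-1.0, hi=1.0):
    """Return the set of voxel indices (i, j, k) that contain at least one point."""
    size = (hi - lo) / resolution
    occupied = set()
    for p in points:
        # Clamp so that points on the upper boundary fall in the last voxel.
        idx = tuple(min(int((c - lo) / size), resolution - 1) for c in p)
        occupied.add(idx)
    return occupied


def empty_fraction(points, resolution):
    """Fraction of voxels in the resolution^3 grid that contain no point."""
    total = resolution ** 3
    return 1.0 - len(occupied_voxels(points, resolution)) / total


if __name__ == "__main__":
    cloud = sample_sphere_surface(2048)  # assumed toy cloud size
    for res in (8, 16, 32):
        print(res, round(empty_fraction(cloud, res), 3))
```

Since each point can occupy at most one voxel, a cloud of 2048 points fills at most 2048 of the 32768 cells in a 32^3 grid, so at least about 94% of the voxels must be empty at that resolution regardless of the shape being scanned.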

[1]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[2]  Lei Zhou,et al.  Learning and Matching Multi-View Descriptors for Registration of Point Clouds , 2018, ECCV.

[3]  Kurt Keutzer,et al.  SqueezeSeg: Convolutional Neural Nets with Recurrent CRF for Real-Time Road-Object Segmentation from 3D LiDAR Point Cloud , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[4]  Yue Wang,et al.  Dynamic Graph CNN for Learning on Point Clouds , 2018, ACM Trans. Graph.

[5]  Peter V. Gehler,et al.  Learning Sparse High Dimensional Filters: Image Filtering, Dense CRFs and Bilateral Neural Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  S. Helgason.  Differential Geometry, Lie Groups, and Symmetric Spaces , 1978.

[7]  Subhransu Maji,et al.  Multi-view Convolutional Neural Networks for 3D Shape Recognition , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[8]  Li Fei-Fei,et al.  ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[9]  Rudrasis Chakraborty,et al.  A CNN for homogneous Riemannian manifolds with applications to Neuroimaging , 2018, 1805.05487.

[10]  Andrew Adams,et al.  Fast High‐Dimensional Filtering Using the Permutohedral Lattice , 2010, Comput. Graph. Forum.

[11]  Subhransu Maji,et al.  SPLATNet: Sparse Lattice Networks for Point Cloud Processing , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[12]  Leonidas J. Guibas,et al.  Volumetric and Multi-view CNNs for Object Classification on 3D Data , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Liqing Zhang,et al.  Tensor Ring Decomposition , 2016, ArXiv.

[14]  Max Welling,et al.  Convolutional Networks for Spherical Signals , 2017, ArXiv.

[15]  Cheng-Hung Lin,et al.  A novel campus navigation APP with augmented reality and deep learning , 2018, 2018 IEEE International Conference on Applied System Invention (ICASI).

[16]  Wei Wu,et al.  PointCNN: Convolution On X-Transformed Points , 2018, NeurIPS.

[17]  Forrest N. Iandola,et al.  SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size , 2016, ArXiv.

[18]  Jitendra Malik,et al.  Factoring Shape, Pose, and Layout from the 2D Image of a 3D Scene , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[19]  Bernhard Schölkopf,et al.  Kernel Principal Component Analysis , 1997, ICANN.

[20]  Matthias Nießner,et al.  ScanComplete: Large-Scale Scene Completion and Semantic Segmentation for 3D Scans , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[21]  Sebastian Scherer,et al.  VoxNet: A 3D Convolutional Neural Network for real-time object recognition , 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[22]  Jianxiong Xiao,et al.  3D ShapeNets: A deep representation for volumetric shapes , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Gernot Riegler,et al.  OctNet: Learning Deep 3D Representations at High Resolutions , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[24]  Shaojie Shen,et al.  Stereo Vision-based Semantic 3D Object and Ego-motion Tracking for Autonomous Driving , 2018, ECCV.

[25]  Leonidas J. Guibas,et al.  PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Leonidas J. Guibas,et al.  ShapeNet: An Information-Rich 3D Model Repository , 2015, ArXiv.

[27]  Leonidas J. Guibas,et al.  PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space , 2017, NIPS.

[28]  A. N. Rajagopalan,et al.  Occlusion-Aware Rolling Shutter Rectification of 3D Scenes , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[29]  Victor S. Lempitsky,et al.  Escape from Cells: Deep Kd-Networks for the Recognition of 3D Point Cloud Models , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[30]  E. Hobson.  The Theory of Spherical and Ellipsoidal Harmonics , 1955.

[31]  Rudrasis Chakraborty,et al.  H-CNNs: Convolutional Neural Networks for Riemannian Homogeneous Spaces , 2018, ArXiv.

[32]  Martin Simonovsky,et al.  Large-Scale Point Cloud Semantic Segmentation with Superpoint Graphs , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[33]  Yang Liu,et al.  O-CNN , 2017, ACM Trans. Graph.

[34]  L. Deng,et al.  The MNIST Database of Handwritten Digit Images for Machine Learning Research [Best of the Web] , 2012, IEEE Signal Processing Magazine.

[35]  Thomas Brox,et al.  Octree Generating Networks: Efficient Convolutional Architectures for High-resolution 3D Outputs , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[36]  R. Buckner,et al.  Normative estimates of cross-sectional and longitudinal brain volume decline in aging and AD , 2005, Neurology.

[37]  Anand Rangarajan,et al.  Simultaneous Nonrigid Registration of Multiple Point Sets and Atlas Construction , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[38]  Subhransu Maji,et al.  Multiresolution Tree Networks for 3D Point Cloud Processing , 2018, ECCV.

[39]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[40]  Prafulla Dhariwal,et al.  Glow: Generative Flow with Invertible 1x1 Convolutions , 2018, NeurIPS.

[41]  Nassir Navab,et al.  Fully-Convolutional Point Networks for Large-Scale Point Clouds , 2018, ECCV.

[42]  Cewu Lu,et al.  LiDAR-Video Driving Dataset: Learning Driving Policies Effectively , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[43]  Alain Pagani,et al.  Learning to Fuse: A Deep Learning Approach to Visual-Inertial Camera Pose Estimation , 2016, 2016 IEEE International Symposium on Mixed and Augmented Reality (ISMAR).

[44]  Subhransu Maji,et al.  3D Shape Segmentation with Projective Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).