Deep fusion of multi-view and multimodal representation of ALS point cloud for 3D terrain scene recognition

Abstract Terrain scene category is useful not only for geographical and environmental research, but also for selecting suitable algorithms, or suitable algorithm parameters, in several point cloud processing tasks to achieve better performance. However, few studies in point cloud processing currently focus on terrain scene classification. In this paper, we propose a novel deep learning framework for 3D terrain scene recognition using a 2D representation of the sparse point cloud. The framework has two key components. (1) Several discriminative low-level local features are extracted from the airborne laser scanning (ALS) point cloud, and the 3D terrain scene is encoded into a multi-view and multimodal 2D representation. (2) A two-level fusion network that embeds feature- and decision-level fusion strategies, and that can be trained end-to-end, is designed to fully exploit the 2D representation of the 3D terrain scene. Experimental results show that our method achieves an overall accuracy of 96.70% and a kappa coefficient of 0.96 in recognizing nine categories of terrain scene point clouds. Extensive design choices of the underlying framework are tested, and the method is compared with other typical methods from the related literature.
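To make the two fusion levels concrete, the following is a minimal sketch and not the network described in the paper. It assumes that feature-level fusion concatenates the per-modality feature vectors of one view, and that decision-level fusion averages the per-view class probabilities. The linear classifier, the feature sizes, the number of views, and all function names are hypothetical placeholders chosen only for illustration.

```python
import numpy as np

NUM_CLASSES = 9  # nine terrain scene categories, as stated in the abstract


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def feature_level_fusion(modal_features):
    """Concatenate the per-modality feature vectors of one view (assumed scheme)."""
    return np.concatenate(modal_features, axis=-1)


def decision_level_fusion(view_probs):
    """Average per-view class probabilities into one prediction (assumed scheme)."""
    return np.mean(view_probs, axis=0)


def classify_scene(views, weights, bias):
    """Classify one scene.

    views: list of views, each a list of 1-D modality feature vectors.
    weights, bias: hypothetical linear classifier shared across all views.
    """
    per_view = []
    for modal_features in views:
        fused = feature_level_fusion(modal_features)
        per_view.append(softmax(fused @ weights + bias))
    scene_probs = decision_level_fusion(np.stack(per_view))
    return int(np.argmax(scene_probs)), scene_probs


rng = np.random.default_rng(0)
# Toy input: 3 views, each with 2 modalities of 8 features (all sizes assumed).
views = [[rng.normal(size=8), rng.normal(size=8)] for _ in range(3)]
weights = rng.normal(size=(16, NUM_CLASSES))
bias = np.zeros(NUM_CLASSES)
label, probs = classify_scene(views, weights, bias)
print(label, probs.round(3))
```

In an end-to-end trainable network, the random linear classifier would be replaced by learned convolutional branches, and the fusion operators themselves could differ from the concatenation and averaging assumed here.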
