DERIVING A DTM FROM A DSM BY UNIFORM REGIONS AND CONTEXT

A Digital Terrain Model (DTM) is typically used in applications such as environment planning, flood risk evaluation and building detection. A Digital Surface Model (DSM) is usually acquired by LASER scanning or from stereoscopic pairs of aerial or satellite images. By contrast, a direct DTM acquisition traditionally requires more effort on the terrain or for scene object classification and the automatic derivation of a DTM from a DSM still faces difficulties. We propose in this paper a new DTM from DSM approach that consists of three steps: DSM region segmentation, region selection and height interpolation. First the DSM is segmented into regions of limited slope. A gradient filter is applied to the DSM raster to highlight height transitions. A connected component algorithm labels the different regions separated by those transitions. Secondly, regions whose perimeter is on the average higher than their surrounding are discarded. Finally, a hierarchical interpolation procedure fills the holes in the DSM due to high gradients or discarded regions. The proposed algorithm developed for urban areas has been applied to the Vaihingen dataset of the ISPRS benchmark and qualitatively validated by the results of its independent evaluation procedure for building detection. In the context of change detection for database revision, this new DTM extraction was applied to an area in Brussels for which digital aerial imagery and LIDAR measurements are available. This reference data allowed for some quantitative evaluation of the DTM errors. INTRODUCTION A Digital Terrain Model (DTM) is typically used in applications such as environment planning, flood risk evaluation, image ortho-rectification for map creation and building detection, to name a few. A common approach for building detection exploits the difference between a DSM (Digital Surface Model) and a DTM. This difference, called normalized DSM (nDSM = DSM-DTM), is the local elevation which can be thresholded to detect building candidates. A DSM is usually acquired from LIDAR measurements or by photogrammetry with aerial or satellite stereo imagery. By contrast, DTM acquisition traditionally requires more effort due to the very slow procedure of terrain survey or the need for classification between ground and non-ground pixels. The derivation of a DTM from a DSM is attractive but still faces difficulties related to the quality of the DSM (1), the ambiguity of some scene objects (like bridges) and the difficulty to classify offterrain points (2). To derive a DTM from a DSM, off-terrain pixels have to be filtered out. A large bibliographic source is available from the LIDAR ground filtering research. In the review of Meng et al. (2), 6 categories of methods are identified. The book chapter written by Pfeifer about LIDAR data filtering and DTM generation (3) is more synthetic and lists 4 categories (morphological filters, progressive densification, surface-based filters and segmentation-based filters). In our previous development (4) the DTM value assigned to a pixel was derived from the percentile 10 of the histogram of DSM values in a square neighbourhood of 50 × 50 m around the pixel. This method corresponding to the morphological category gave satisfying results in case of small size buildings or woods and where the terrain is rather flat. We propose in this paper a new DTM from DSM approach that consists of three steps: DSM region segmentation, region selection and height interpolation, all detailed in the next section. Several DOI: 10.12760/01-2015-1-02 EARSeL eProceedings 14, 1/2015 17 papers described approaches based on segmentation, category 4 of Pfeifer’s classification. PérezGarcia et al. (5) used the distance to a local plane to segment and guide ground point densification. Yuan et al. (6) exploited an intensity image to perform region growing. Yan et al. (7) segmented the DSM with a local slope constraint. Regions higher than their neighbours are filtered out. Compared to these works, our approach does not require seeds for DSM segmentation. All the regions are detected in two DSM raster scans rather than individually and successively for each seed. The method was designed for urban areas with high resolution imagery. It was qualitatively evaluated for building detection in the context of the ISPRS test on urban classification. A more quantitative evaluation was performed in the framework of our cooperation with the Belgian National Geographical Institute (NGI) where the whole process of DSM acquisition and DTM derivation was compared against LIDAR measures, owned by CIRB (Centre d'Informatique pour la Région Bruxelloise). METHODOLOGY To derive a DTM from a DSM, ground pixels have to be identified. Automatic pixel classification into terrain and non-terrain is not obvious if no land cover database is available. Vegetation is a sensitive issue, as the distinction between grass (terrain) and trees (non-terrain) is often ambiguous and is further complicated by the recent trend to add vegetation on roofs. Figure 1: DTM from DSM: DSM for Vaihingen data (a); DSM gradient (b); Large regions with small gradients (c); Regions, except those locally elevated (d); Sparse DTM from DSM values (e); Full DTM from interpolation (f). EARSeL eProceedings 14, 1/2015 18 For the above reasons, we gave preference to geometrical rules. The constraints on size, slope and neighbouring height differences are easy to derive from the sole DSM and geometrical rules are intuitive and implemented with understandable parameters. Our new DTM from DSM approach consists of three steps: DSM segmentation into uniform regions, selection of terrain regions and interpolation between terrain regions. DSM segmentation The DSM is first segmented into regions of limited slope. The gradient of the DSM raster is computed to highlight height transitions (Figure 1b). The gradient threshold is typically around 0.4 (25° from horizontal) as a compromise to get sufficiently large segments while limiting the variation within each of them. A two-pass connected-component labelling algorithm, similar to the one described and analysed in Walczyk et al. (8) for real-time video processing, identifies the different regions separated by the transitions (Figure 1c). This labelling algorithm is particularly fast, as it only scans twice the DSM raster with very few pixel accesses (globally 9 per pixel). Since they are likely to be less reliable, small regions are filtered out thanks to the label pixel counts obtained in one raster scan. Region filtering Secondly, the regions identified in the first step whose perimeter is on the average higher than their surrounding are discarded. A global image transform is performed to highlight valid regions. The difference between the original DSM and a blurred version (with a 4 × 4 m box filter) reveals large differences D at region borders (Figure 2b), where the DSM slope is strong by construction (see previous subsection). The sign of the difference is dependent on the relative height of the neighbouring regions. Each region is assigned a count PD for positive (D > 2 m) and a count ND for negative (D < -2 m) pixel difference D. A region is rejected, if its count PD exceeds half of its count ND. Figure 1d shows kept regions, corresponding mainly to the road network, squares and gardens. Figure 2: Region filtering. Uniform regions, in false colours (a); DSM transformation to highlight higher (in green) and lower (in red) border regions (b); Lower regions of DSM (c). Height interpolation Finally, a hierarchical interpolation procedure fills empty areas corresponding to high gradients or discarded regions. EARSeL eProceedings 14, 1/2015 19 The DTM values are first initialised with DSM values in retained regions (Figure 1e). This DTM with holes is used to construct a multi-resolution pyramid with a scale decrement equal to 2, thanks to a recursive lowpass filtering and subsampling to create each level (Figure 3a). The recursion stops when a level with no hole is created. Then the pyramid is traversed in the other direction, towards the higher resolution. Starting from the penultimate lowest resolution, pixels with no DTM value receive the interpolated values from opposite direct neighbours (8-connectivity) at the same level or, if there are none, from the corresponding neighbours in the directly lower resolution. The holes in the DTM are thus progressively filled from lower to higher resolutions, until the initial DSM resolution is reached. Figure 3: Image pyramid for DTM interpolation. Pyramid construction towards lower resolution (a); Pyramid filling from lower resolution (b). This DTM from DSM approach is particularly well suited to very high resolution imagery (0.1m), since the segmented regions are sufficiently large. In 1 m imagery, small objects like little streets, terraces and small gardens have little chance to be retained as ground surfaces. However, even at these scales, small objects like cars or garden huts may be erroneously kept in the DTM so that post filtering could improve results. These artefacts are of limited amplitude and normally do not penalize applications such as building detection. RESULTS The proposed algorithm was applied to the Vaihingen dataset of the ISPRS benchmark and was qualitatively validated by the results of its independent evaluation procedure for building detection. In the context of change detection for database revision for the Belgian National Geographic Institute, this new DTM extraction was applied to digital aerial imagery of Brussels (GSD=7.5cm), for which ground truth values from LIDAR are available. This reference data allowed for some quantitative evaluation of the DTM errors. EARSeL eProceedings 14, 1/2015 20 Vaihingen benchmark The data used for development and test consists of the Vaihingen set of the ISPRS benchmark (9). This set contains CIR aerial images (Red, Green and Near Infrared channels), a derived DSM map and an ortho-rectif