Comparing three approaches of spatial disaggregation of legacy soil maps based on DSMART algorithm

Abstract. Enhancing the spatial resolution of pedological information is a major challenge in the field of Digital Soil Mapping (DSM). Several techniques have emerged to disaggregate conventional soil maps that are initially available at a coarser spatial resolution than required for solving environmental and agricultural issues. At the regional level, polygon maps represent soil cover as a tessellation of polygons defining Soil Map Units (SMU), where each SMU can include one or several Soil Type Units (STU) with given proportions derived from expert knowledge. Such polygon maps can be disaggregated at a finer spatial resolution with the machine learning algorithm DSMART (Disaggregation and Harmonisation of Soil Map Units Through Resampled Classification Trees). This study aimed to compare three approaches of spatial disaggregation of legacy soil maps based on DSMART decision trees, in order to test the hypothesis that incorporating soil landscape distribution rules into the disaggregation may improve the accuracy of the resulting soil maps. Overall, two modified DSMART algorithms (DSMART with extra soil profiles, DSMART with soil landscape relationships) and the original DSMART algorithm were tested. The quality of the disaggregated soil maps at 50 m resolution was assessed over a large study area (6775 km²) using an external validation based on 135 independent soil profiles selected by probability sampling, 755 legacy soil profiles and existing detailed 1 : 25 000 soil maps. Pairwise comparisons were also performed, using the Shannon entropy measure, to spatially locate differences between the disaggregated maps. The main results show that adding soil landscape relationships to the disaggregation process enhances the prediction of the soil type distribution.
Considering the three most probable STU and using the 135 independent soil profiles, the overall accuracy was 19.8 % for DSMART with soil landscape relationships (expert rules), against 18.1 % for the original DSMART and 16.9 % for DSMART with extra soil profiles. These measures were markedly higher when validated using 3 × 3 windows: 28.5 % for DSMART with soil landscape relationships, 25.3 % for the original DSMART and 21 % for DSMART with extra soil observations. In general, adding soil landscape relationships or extra soil observations constrains the model to predict a specific STU that can occur in specific environmental conditions. Thus, including global soil landscape expert rules in the DSMART algorithm is crucial to obtain consistent soil maps with a clear internal disaggregation of SMU across the landscape.
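The validation measure reported above (overall accuracy over the three most probable STU, at the pixel and within a 3 × 3 window) can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the study: the function name `topk_window_accuracy`, the array layout, and the rule that a profile counts as correctly predicted when its observed STU appears among the k most probable STU of any cell in the window are all hypothetical.

```python
import numpy as np


def topk_window_accuracy(ranked_maps, rows, cols, observed, window=1):
    """Share of profiles whose observed STU is among the k most probable STU.

    ranked_maps : int array of shape (k, height, width); ranked_maps[i] holds
                  the code of the (i+1)-th most probable STU in each cell.
    rows, cols  : cell indices of the validation profiles.
    observed    : STU code observed at each profile.
    window      : odd side length of the square neighbourhood
                  (1 = the pixel itself, 3 = a 3 x 3 window).

    Assumption: a hit anywhere in the window counts as a correct prediction.
    """
    if len(observed) == 0:
        raise ValueError("at least one validation profile is required")
    half = window // 2
    _, height, width = ranked_maps.shape
    hits = 0
    for r, c, stu in zip(rows, cols, observed):
        # Clip the window at the map border.
        r0, r1 = max(r - half, 0), min(r + half + 1, height)
        c0, c1 = max(c - half, 0), min(c + half + 1, width)
        if np.any(ranked_maps[:, r0:r1, c0:c1] == stu):
            hits += 1
    return hits / len(observed)


# Toy 4 x 4 map (hypothetical codes): STU 1, 2, 3 ranked first, second, third
# everywhere, except one cell whose most probable STU is 7.
ranked = np.stack([np.full((4, 4), code) for code in (1, 2, 3)])
ranked[0, 0, 1] = 7

profile_rows = [0, 2, 3]
profile_cols = [0, 2, 3]
profile_stu = [7, 2, 9]

pixel_accuracy = topk_window_accuracy(ranked, profile_rows, profile_cols, profile_stu, window=1)
window_accuracy = topk_window_accuracy(ranked, profile_rows, profile_cols, profile_stu, window=3)
print(f"pixel: {pixel_accuracy:.3f}, 3 x 3 window: {window_accuracy:.3f}")
```

In this toy case the first profile is missed at the pixel but matched in the neighbouring cell, which shows why window-based accuracy can only be equal to or higher than pixel-based accuracy.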