Machine learning for digital soil mapping: Applications, challenges and suggested solutions

Abstract The uptake of machine learning (ML) algorithms in digital soil mapping (DSM) is transforming the way soil scientists produce their maps. Within the past two decades, soil scientists have applied ML to a wide range of scenarios, mapping soil properties or classes with various ML algorithms, at spatial scales from the local to the global, and with depth. The wide adoption of ML for soil mapping was made possible by the increase in data availability, the ease of accessing environmental spatial data, and the development of software and computational tools to analyse these data. In this article, we review the current use of ML in DSM, identify the key challenges and suggest solutions from the existing literature. There is a growing interest in the use of ML in DSM. Most studies emphasize prediction and the accuracy of the predicted maps for applications such as the baseline production of quantitative soil information. Few studies account for existing soil knowledge in the modelling process or quantify the uncertainty of the predicted maps. Further, we discuss the challenges related to the application of ML for soil mapping and suggest solutions from existing studies in the natural sciences. The challenges are: sampling, resampling, accounting for the spatial information, multivariate mapping, uncertainty analysis, validation, integration of pedological knowledge and interpretation of the models. Overall, the current literature shows few attempts to understand the underlying soil structure or processes using the predicted maps and the ML model, for example by generating hypotheses on mechanistic relationships among variables. In this regard, several additional challenging aspects need to be considered, such as the inclusion of pedological knowledge in the ML algorithm or the interpretability of the calibrated ML model. Tackling these challenges is critical for ML to gain credibility and scientific consistency in soil science.
We conclude that future developments of ML in DSM could incorporate three core elements: plausibility, interpretability, and explainability. These would encourage soil scientists to couple model prediction with pedological explanation and an understanding of the underlying soil processes.
