A Comparative Assessment of Geostatistical, Machine Learning, and Hybrid Approaches for Mapping Topsoil Organic Carbon Content

Accurate digital soil mapping (DSM) of soil organic carbon (SOC) remains challenging because SOC varies strongly in space and is spatially dependent. This study compares six representative methods from three families of DSM techniques for mapping SOC content in an area surrounding Changchun, Northeast China. The methods are ordinary kriging (OK) and geographically weighted regression (GWR) from geostatistics; support vector machines for regression (SVR) and artificial neural networks (ANN) from machine learning (ML); and geographically weighted regression kriging (GWRK) and artificial neural network kriging (ANNK) as hybrid approaches. The hybrid approaches combine a GWR or ANN prediction, respectively, with ordinary kriging of that model's residuals. Environmental variables, including soil properties and climatic, topographic, and remote sensing data, were used as predictors. The SOC content maps produced by each model were validated against independent test data using the mean error, root mean squared error, and coefficient of determination. The prediction maps depicted the spatial variation and patterns of SOC content across the study area. In decreasing order of accuracy, the methods ranked ANNK, SVR, ANN, GWRK, OK, and GWR. The two-step hybrid approaches outperformed their corresponding individual models, and the non-linear models outperformed the linear ones. When uncertainty and efficiency are also considered, ML and the two-step approaches are more suitable than geostatistics in highly heterogeneous regional landscapes. The study concludes that ANNK is a promising approach for mapping SOC content at a local scale.
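The two-step hybrid idea and the three validation statistics named above can be illustrated with a minimal sketch. It is not the study's implementation: a plain least-squares trend stands in for GWR or ANN, the exponential variogram parameters (`sill`, `corr_range`) are assumed rather than fitted, the data are synthetic, and all function names are hypothetical. The mean error is taken as predicted minus observed, and the coefficient of determination as the squared Pearson correlation, which is one of several conventions.

```python
import numpy as np

def exp_variogram(h, sill=1.0, corr_range=10.0):
    """Exponential semivariogram with no nugget (assumed, not fitted, parameters)."""
    return sill * (1.0 - np.exp(-3.0 * h / corr_range))

def ordinary_kriging(xy_obs, z_obs, xy_new, variogram=exp_variogram):
    """Ordinary kriging of z_obs from xy_obs to xy_new."""
    n = len(xy_obs)
    d_obs = np.linalg.norm(xy_obs[:, None, :] - xy_obs[None, :, :], axis=2)
    d_new = np.linalg.norm(xy_obs[:, None, :] - xy_new[None, :, :], axis=2)
    # Kriging system: semivariances bordered by the unbiasedness constraint.
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(d_obs)
    A[n, n] = 0.0
    b = np.ones((n + 1, len(xy_new)))
    b[:n, :] = variogram(d_new)
    weights = np.linalg.solve(A, b)[:n, :]  # drop the Lagrange multiplier row
    return weights.T @ z_obs

def fit_trend(X, y):
    """Least-squares trend with intercept (stand-in for GWR or ANN)."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict_trend(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def hybrid_predict(xy_obs, X_obs, y_obs, xy_new, X_new):
    """Step 1: model the trend. Step 2: krige its residuals and add them back."""
    coef = fit_trend(X_obs, y_obs)
    residuals = y_obs - predict_trend(coef, X_obs)
    return predict_trend(coef, X_new) + ordinary_kriging(xy_obs, residuals, xy_new)

def validation_metrics(observed, predicted):
    """Return (mean error, root mean squared error, squared Pearson correlation)."""
    err = predicted - observed
    me = float(err.mean())
    rmse = float(np.sqrt((err ** 2).mean()))
    r2 = float(np.corrcoef(observed, predicted)[0, 1] ** 2)
    return me, rmse, r2

# Synthetic data for illustration only.
rng = np.random.default_rng(0)
xy = rng.uniform(0.0, 100.0, size=(40, 2))   # sample coordinates
X = rng.normal(size=(40, 2))                 # two hypothetical covariates
y = 2.0 + X @ np.array([1.5, -0.5]) + rng.normal(0.0, 0.3, size=40)

train, test = np.arange(30), np.arange(30, 40)
pred_test = hybrid_predict(xy[train], X[train], y[train], xy[test], X[test])
me, rmse, r2 = validation_metrics(y[test], pred_test)
```

Because the variogram here has no nugget, kriging the residuals interpolates exactly, so the hybrid prediction reproduces the observed values at the training locations. Only held-out points such as `test` give a meaningful accuracy estimate, which is why the study validates against independent test data.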
