Modeling soil bulk density through a complete data scanning procedure: Heuristic alternatives

Abstract Soil bulk density (BD) is a very important factor in land drainage and reclamation, irrigation scheduling (for estimating the soil volumetric water content), and assessing soil carbon and nutrient stocks, as well as in determining the pollutant mass balance in soils. Numerous pedotransfer functions have been suggested so far to relate soil BD values to soil parameters (e.g. soil separates, carbon content, etc.). The present paper aims at simulating soil BD from easily measured soil variables using heuristic gene expression programming (GEP), neural network (NN), random forest (RF), support vector machine (SVM), and boosted regression tree (BT) techniques. The statistical Gamma test was used to identify the soil parameters with the greatest influence on BD. The applied models were assessed through k-fold testing, in which all the available data patterns were involved in both the training and testing stages, providing an accurate assessment of the models' accuracy. Some existing pedotransfer functions were also applied and compared with the heuristic models. The results revealed that the heuristic GEP model outperformed the other applied models both globally and per test stage. Moreover, all the applied heuristic models were considerably more accurate than the applied pedotransfer functions. Using k-fold testing provides a more detailed assessment of the models.
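The k-fold testing idea mentioned in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the data are synthetic, the number of folds (k = 5) is arbitrary, RMSE is just one possible accuracy measure, and an ordinary least-squares fit stands in for the heuristic models (GEP, NN, RF, SVM, BT), which are not reproduced here. The function names `k_fold_indices` and `k_fold_rmse` are hypothetical. The sketch only shows how every data pattern ends up in a test fold exactly once, so that accuracy can be reported both per test stage and globally.

```python
import numpy as np

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and split them into k near-equal test folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), k)

def k_fold_rmse(X, y, k=5, seed=0):
    """Return (per-fold RMSE list, global RMSE) for a least-squares stand-in model."""
    folds = k_fold_indices(len(y), k, seed)
    predictions = np.empty(len(y), dtype=float)
    per_fold = []
    for test_idx in folds:
        # Train on everything except the current test fold.
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        A_train = np.column_stack([X[train_mask], np.ones(train_mask.sum())])
        coef, *_ = np.linalg.lstsq(A_train, y[train_mask], rcond=None)
        # Predict the held-out fold and store the predictions by original index.
        A_test = np.column_stack([X[test_idx], np.ones(len(test_idx))])
        predictions[test_idx] = A_test @ coef
        fold_err = predictions[test_idx] - y[test_idx]
        per_fold.append(float(np.sqrt(np.mean(fold_err ** 2))))
    # Global score: each pattern was predicted exactly once while held out.
    global_rmse = float(np.sqrt(np.mean((predictions - y) ** 2)))
    return per_fold, global_rmse

# Synthetic stand-in data (assumption): three predictors and a BD-like target.
rng = np.random.default_rng(42)
X = rng.uniform(0.0, 1.0, size=(100, 3))
y = 1.6 - 0.4 * X[:, 0] + 0.1 * X[:, 1] + rng.normal(0.0, 0.05, size=100)

per_fold, global_rmse = k_fold_rmse(X, y, k=5)
print("per-fold RMSE:", [round(v, 3) for v in per_fold])
print("global RMSE:", round(global_rmse, 3))
```

Reporting the per-fold values alongside the global value is what allows a comparison of models both per test stage and overall, rather than on a single arbitrary train/test split.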
