Abstract Hypotheses are of major importance in scientific research. In current applications of machine learning algorithms for soil mapping, the hypotheses being tested or developed are often ambiguous or undefined. Mapping soil properties or classes, however, tells us little about the dynamics and processes that underlie soil genesis and evolution. When the soil map is intended for applications outside soil science, such as policy making or baseline production of quantitative soil information, it should be interpreted in light of that application. Otherwise, we recommend that soil scientists provide hypotheses to accompany their research. A hypothesis is usually formulated at the beginning of the research and, in some cases, motivates data collection. Here we argue that when applying data-driven techniques such as machine learning, developing hypotheses can be a useful end point of the research. The spatial pattern predicted by the machine learning model and the correlations found among the covariates offer an opportunity to develop hypotheses, which are likely to require additional analyses and datasets to be tested. Systematically providing scientific hypotheses in digital soil mapping studies will enable the soil science community to build on previous work and will increase the credibility of data-driven algorithms as a means to accelerate discovery on soil processes.