Using deep learning to predict soil properties from regional spectral data

Abstract Diffuse reflectance infrared spectroscopy allows soil information to be acquired rapidly in the field or the laboratory. Worldwide enthusiasm for vis-NIR spectroscopy research has produced regional to global soil spectral libraries. While machine learning methods have been used to process spectral data, such large regional datasets are better handled with big data analytics. Deep learning has already transformed the way data are analysed in many fields and could also change the way we model soil spectral data. This study developed and evaluated convolutional neural networks (CNNs), a type of deep learning algorithm, as a new way to predict soil properties from raw soil spectra. We demonstrated the effectiveness of CNNs on the LUCAS soil database, which consists of around 20,000 topsoil observations from Europe with physicochemical and biological properties. To fully utilise the capacity of the CNN model, we represented the soil spectral data as a 2-dimensional spectrogram, showing the reflectance as a function of wavelength and frequency. We showed that a CNN can be trained in a multi-task setting to predict six soil properties simultaneously in one model: organic carbon (OC), cation exchange capacity (CEC), clay, sand, pH, and total N. Compared with conventional methods such as PLS regression and the Cubist regression tree, the CNN performed significantly better, especially the multi-task model. For soil organic carbon prediction, the multi-task CNN decreased the error by 87% compared with PLS and by 62% compared with Cubist. This approach proved effective when trained on a relatively large dataset. The high accuracy of the CNN makes it an ideal tool for modelling soil spectral data.
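The 2-dimensional spectrogram mentioned in the abstract can be illustrated with a short-time Fourier transform applied along the wavelength axis. The abstract does not state which transform or parameters were used, so the following is only a minimal sketch under assumptions: the function name `spectrogram`, the window length, the hop size, and the synthetic reflectance curve are all hypothetical and chosen for illustration.

```python
import numpy as np


def spectrogram(spectrum, window_size=64, hop=16):
    """Short-time Fourier magnitude of a 1-D spectrum.

    Returns an array of shape (n_frequency_bins, n_frames), where each
    column describes the local frequency content around one wavelength
    window. window_size and hop are illustrative assumptions.
    """
    spectrum = np.asarray(spectrum, dtype=float)
    if spectrum.size < window_size:
        raise ValueError("spectrum is shorter than the window")

    # A Hann window tapers each segment to reduce spectral leakage.
    window = np.hanning(window_size)

    # Slide the window along the wavelength axis in steps of `hop`.
    starts = range(0, spectrum.size - window_size + 1, hop)
    frames = np.stack([spectrum[s:s + window_size] * window for s in starts])

    # Real FFT of every frame: shape (n_frames, window_size // 2 + 1).
    magnitude = np.abs(np.fft.rfft(frames, axis=1))

    # Transpose so rows are frequency bins and columns are wavelength windows.
    return magnitude.T


# Synthetic stand-in for one soil spectrum (not real LUCAS data).
wavelengths = np.linspace(400.0, 2500.0, 1024)  # nm, hypothetical grid
reflectance = (
    0.3
    + 0.1 * np.sin(wavelengths / 150.0)
    - 0.05 * np.exp(-((wavelengths - 1900.0) / 30.0) ** 2)
)

image = spectrogram(reflectance)
print(image.shape)  # (33, 61)
```

An array of this kind can be passed to a 2-dimensional convolutional layer as a single-channel image. In a multi-task setting, the shared convolutional layers would feed one output per soil property.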
