Estimation of the spatial distribution of heavy metal in agricultural soils using airborne hyperspectral imaging and random forest.

Hyperspectral imaging, with its hundreds of bands and high spectral resolution, offers a promising approach for estimating heavy metal concentrations in agricultural soils. Fast retrieval from airborne imagery over large areas is of great importance for environmental monitoring and decision support. However, few studies have focused on estimating soil heavy metal concentrations from airborne hyperspectral imaging. In this study, we used airborne hyperspectral data acquired over the LiuXin Mine in China with the HySpex VNIR-1600 and HySpex SWIR-384 sensors to establish a spectral-analysis-based model for retrieving heavy metal concentrations. First, sixty soil samples were collected in situ, and their heavy metal concentrations (Cr, Cu, Pb) were determined by inductively coupled plasma-mass spectrometry. Because mixed pixels are widespread in airborne hyperspectral images, spectral unmixing was conducted to obtain purer soil spectra and to improve the estimation accuracy. Ten estimation models were compared, including four random forest (RF) variants: standard random forest (SRF), regularized random forest (RRF), guided random forest (GRF), and guided regularized random forest (GRRF). Comparing the estimation results, the best accuracy for Cr, Cu, and Pb was obtained by RF, which indicates that RF predicts the three heavy metals better than the other models in this area. For Cr, Cu, and Pb, the best RF model yields Rp² values of 0.75, 0.68, and 0.74, respectively, and RMSEp values of 5.62, 8.24, and 2.81 mg/kg, respectively. The experiments show that the average estimated values are close to the measured values and that high estimated values are concentrated near several industrial sites, validating the effectiveness of the presented method.
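The accuracy figures above are reported as Rp² and RMSEp. The abstract does not define them, so the following is only a minimal sketch, assuming that Rp² is the coefficient of determination and RMSEp the root-mean-square error, both computed on held-out prediction samples. The function names and the sample values are hypothetical and are not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error, in the units of the inputs (e.g. mg/kg)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Made-up concentrations (mg/kg) for illustration only; not data from the study.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

print(f"R2 = {r_squared(observed, predicted):.3f}")
print(f"RMSE = {rmse(observed, predicted):.3f} mg/kg")
```

Some studies instead report the squared correlation between observed and predicted values, which can differ from this definition when predictions are biased.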
