A Transfer Learning Approach Utilizing Combined Artificial Samples for Improved Robustness of Model to Estimate Heavy Metal Contamination in Soil

Benefiting from nanometer-scale sampling intervals and subtle spectral information in the visible and near-infrared bands, hyperspectral technology is considered an efficient means of monitoring soil heavy metal contamination, in which the robustness of the prediction model is driven by the increased spectral dimension used in model analysis. Considering the positive correlation between sample size and spectral dimension, this study focuses on a novel way of enlarging the sample size to improve model performance by i) preparing artificial samples, which take advantage of the flexibility and control offered by the laboratory environment compared with collecting wild samples, and ii) using a transfer learning method called transfer component analysis (TCA) to reduce the spectral feature differences caused by soil heterogeneity, so that the model is trained on data with the same distribution. The proposed approach was tested on three heavy metals, namely copper (Cu), cadmium (Cd), and lead (Pb), in samples collected from a mining area in the Xiangjiang Basin, Hunan Province, China. The experiments showed that the initial model, built from a small number of wild samples, was highly sensitive to changes in the training samples. In contrast, the modified model with TCA showed good robustness and excellent predictive ability: the average coefficient of determination (R2) and ratio of prediction to deviation (RPD) improved to 0.73 and 1.90, 0.74 and 1.92, and 0.72 and 1.73 for the three metals, respectively. The results indicate that hyperspectral analysis offers a potentially more reliable, low-cost modeling method for predicting soil heavy metals.
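
The two technical ingredients named in the abstract, TCA and the R2/RPD accuracy measures, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the linear kernel, the regularization weight `mu`, the number of components, the function names `tca` and `r2_and_rpd`, and the synthetic "spectra" are all assumptions made for illustration. RPD is taken here as the standard deviation of the reference values divided by the RMSE of prediction, which is one common definition and may differ from the one used in the paper.

```python
import numpy as np


def tca(Xs, Xt, n_components=2, mu=1.0):
    """Sketch of transfer component analysis with a linear kernel.

    Xs: source-domain spectra (e.g. artificial samples), shape (ns, d)
    Xt: target-domain spectra (e.g. wild samples), shape (nt, d)
    Returns both domains embedded in a shared low-dimensional space.
    """
    ns, nt = len(Xs), len(Xt)
    n = ns + nt
    X = np.vstack([Xs, Xt])
    K = X @ X.T  # linear kernel (an assumption; other kernels are possible)

    # MMD coefficient matrix: 1/ns^2 (source-source), 1/nt^2 (target-target),
    # -1/(ns*nt) (cross terms).
    e = np.concatenate([np.full(ns, 1.0 / ns), np.full(nt, -1.0 / nt)])
    L = np.outer(e, e)

    # Centering matrix, used to preserve data variance in the embedding.
    H = np.eye(n) - np.ones((n, n)) / n

    # Leading eigenvectors of (K L K + mu I)^{-1} K H K.
    M = np.linalg.solve(K @ L @ K + mu * np.eye(n), K @ H @ K)
    eigvals, eigvecs = np.linalg.eig(M)
    order = np.argsort(-eigvals.real)[:n_components]
    W = eigvecs[:, order].real

    Z = K @ W
    return Z[:ns], Z[ns:]


def r2_and_rpd(y_true, y_pred):
    """Coefficient of determination and RPD (assumed: SD / RMSE)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    rmse = np.sqrt(np.mean(residuals ** 2))
    rpd = np.std(y_true, ddof=1) / rmse
    return r2, rpd


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for spectra: 20 source and 8 target samples, 50 bands,
    # with a constant offset imitating a distribution shift between domains.
    Xs = rng.normal(size=(20, 50))
    Xt = rng.normal(loc=0.5, size=(8, 50))
    Zs, Zt = tca(Xs, Xt, n_components=3)
    print(Zs.shape, Zt.shape)
    print(r2_and_rpd([1, 2, 3, 4], [1.5, 2, 3, 3.5]))
```

In a workflow like the one the abstract describes, a regression model would then be trained on the embedded source and target samples together and scored with `r2_and_rpd` on held-out wild samples; the regression step is omitted here because the abstract does not specify the model.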
