Rapid prediction of total petroleum hydrocarbons concentration in contaminated soil using vis-NIR spectroscopy and regression techniques.

Visible and near infrared spectrometry (vis-NIRS) coupled with data mining techniques can offer fast and cost-effective quantitative measurement of total petroleum hydrocarbons (TPH) in contaminated soils. The literature, however, reports significant differences in vis-NIRS performance between linear and nonlinear calibration methods. This study compared the performance of linear partial least squares regression (PLSR) with nonlinear random forest (RF) regression for the calibration of vis-NIRS when analysing TPH in soils. Eighty-eight soil samples (3 uncontaminated and 85 contaminated) collected from three sites in the Niger Delta were scanned using an analytical spectral device (ASD) spectrophotometer (350-2500 nm) in diffuse reflectance mode. Sequential ultrasonic solvent extraction-gas chromatography (SUSE-GC) was used as the reference quantification method for TPH, which equals the sum of the aliphatic and aromatic fractions ranging between C10 and C35. Prior to model development, spectra were subjected to pre-processing including noise cut, maximum normalization, first derivative and smoothing. Then 65 samples were selected as the calibration set and the remaining 20 samples as the validation set. Both the vis-NIR spectrometry and gas chromatography profiles of the 85 soil samples were subjected to RF and PLSR with leave-one-out cross-validation (LOOCV) for the calibration models. Results showed that the RF calibration model, with a coefficient of determination (R2) of 0.85, a root mean square error of prediction (RMSEP) of 68.43 mg kg-1, and a residual prediction deviation (RPD) of 2.61, outperformed PLSR (R2 = 0.63, RMSEP = 107.54 mg kg-1 and RPD = 2.55) in cross-validation. These results indicate that the RF modelling approach accounts for the nonlinearity of the soil spectral responses, hence providing significantly higher prediction accuracy than linear PLSR.
It is recommended to adopt vis-NIRS coupled with the RF modelling approach as a portable and cost-effective method for the rapid quantification of TPH in soils.
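
The three figures of merit quoted above (R2, RMSEP and RPD) can be illustrated with a minimal sketch. The abstract does not state the exact formulas used, so the definitions here are assumptions: R2 is taken as 1 - SS_res/SS_tot, and RPD as the sample standard deviation of the reference values divided by RMSEP. The function names and the TPH values are hypothetical and invented purely for illustration; they are not data from the study.

```python
import math
import statistics


def rmsep(reference, predicted):
    """Root mean square error of prediction, in the units of the inputs."""
    squared_errors = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(reference, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_ref = statistics.fmean(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


def rpd(reference, predicted):
    """Residual prediction deviation: sample SD of the reference / RMSEP (assumed)."""
    return statistics.stdev(reference) / rmsep(reference, predicted)


# Hypothetical reference (e.g. GC) and predicted (e.g. vis-NIRS) TPH values, mg/kg.
reference = [100.0, 200.0, 300.0, 400.0, 500.0]
predicted = [110.0, 190.0, 310.0, 390.0, 510.0]

print(f"R2    = {r_squared(reference, predicted):.3f}")
print(f"RMSEP = {rmsep(reference, predicted):.2f} mg/kg")
print(f"RPD   = {rpd(reference, predicted):.2f}")
```

Because RPD scales the prediction error by the spread of the reference values, it should only be compared between models evaluated on the same sample set.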
