Hyperspectral Data and Machine Learning for Estimating CDOM, Chlorophyll a, Diatoms, Green Algae and Turbidity

Inland waters are important to scientists and authorities alike because they are essential ecosystems and well known for their biodiversity. In situ measurements of water quality parameters, however, are spatially limited, costly and time-consuming. In this paper, we propose combining hyperspectral data with machine learning methods to estimate, and thereby monitor, several water quality parameters. In contrast to commonly applied techniques such as band ratios, this approach is data-driven and does not rely on domain knowledge. We focus on CDOM, chlorophyll a and turbidity as well as the concentrations of two algae types, diatoms and green algae. To investigate the potential of our proposal, we rely on measured data that we sampled with three different sensors on the river Elbe in Germany from 24 June to 12 July 2017. The measurement setup, consisting of two probe sensors and a hyperspectral sensor, is described in detail. To estimate the five variables, we present a regression framework involving ten machine learning models and two preprocessing methods, which allows the regression performance to be evaluated for each model and variable. The best-performing model for each variable achieves a coefficient of determination R² in the range of 89.9% to 94.6%. This demonstrates the potential of machine learning approaches with hyperspectral data. In further investigations, we will focus on generalizing the regression framework to prepare its application to different types of inland waters.
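
As a rough illustration of the data-driven idea summarized above, the following minimal sketch fits one simple regressor to spectra and scores it with the coefficient of determination R². It is not the paper's framework: the synthetic spectra, the linear concentration-to-reflectance relation, the noise level, the choice of a k-nearest-neighbor model and all function names are assumptions made only for this example.

```python
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def knn_predict(train_X, train_y, x, k=3):
    """Average the targets of the k training spectra closest to x."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), target)
        for row, target in zip(train_X, train_y)
    )
    nearest = dists[:k]
    return sum(target for _, target in nearest) / len(nearest)


def make_synthetic_spectra(n_samples, n_bands, seed=0):
    """Hypothetical data: each band responds linearly to one concentration."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        conc = rng.uniform(0.0, 50.0)  # assumed concentration range
        spectrum = [
            0.01 * conc * (band / n_bands) + rng.gauss(0.0, 0.01)
            for band in range(n_bands)
        ]
        X.append(spectrum)
        y.append(conc)
    return X, y


# Split the synthetic samples into a training part and a held-out part.
X, y = make_synthetic_spectra(n_samples=200, n_bands=20)
split = 140
train_X, train_y = X[:split], y[:split]
test_X, test_y = X[split:], y[split:]

# Predict the held-out concentrations and score them with R2.
pred = [knn_predict(train_X, train_y, x) for x in test_X]
score = r2_score(test_y, pred)
print(f"R2 on held-out synthetic samples: {score:.3f}")
```

Multiplying the score by 100 gives R² as a percentage, the form in which the abstract reports it. Comparing several models would repeat the prediction and scoring steps once per model and target variable.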
