Integrated data fusion and mining techniques for monitoring total organic carbon concentrations in a lake

Monitoring water quality in near real time to address water resource management and public health concerns in coupled natural systems and the built environment is no easy task. Total organic carbon (TOC) in surface waters is a known precursor of disinfection by-products formed during drinking water treatment, such as total trihalomethanes (TTHMs), which are suspected carcinogens and have been linked to birth defects when water treatment plants cannot remove them. In this paper, an early warning system using integrated data fusion and mining (IDFM) techniques is proposed to estimate spatiotemporal distributions of TOC on a daily basis in a lake that serves as the source for a drinking water treatment plant. Landsat satellite images have high spatial resolution, but their use is limited by a long overpass interval of 16 days. Coarse-resolution sensors with frequent revisit times, such as MODIS, cannot provide detailed water quality information because of their low spatial resolution. This issue can be resolved with data or sensor fusion techniques such as IDFM, in which the high-spatial-resolution Landsat images and the high-temporal-resolution MODIS images are fused and analysed by a suite of regression models to produce synthetic images with both high spatial and high temporal resolution. Analysis of the results using four statistical indices confirmed that the genetic programming model can accurately estimate the spatial and temporal variations of TOC concentrations in a small lake. The model shows a slight bias towards overestimating TOC, and it requires cloud-free input data for the lake. The IDFM approach reconstructs the spatiotemporal TOC distributions in the lake in support of safe drinking water treatment.
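
The fusion idea described above can be illustrated with a deliberately simplified sketch. It is not the fusion algorithm used in the paper, which the abstract does not specify. It only shows the basic principle of carrying the temporal change seen by a coarse, frequent sensor onto a fine-resolution image from an earlier date. The function names, the `scale` parameter, and the choice of RMSE and mean bias as evaluation indices are all assumptions made for illustration; the abstract does not name its four statistical indices.

```python
import math


def fuse_temporal_change(fine_t0, coarse_t0, coarse_t1, scale):
    """Predict a fine-resolution image at time t1 (hypothetical helper).

    fine_t0   : 2D list, fine-resolution image at t0 (Landsat-like)
    coarse_t0 : 2D list, coarse-resolution image at t0 (MODIS-like)
    coarse_t1 : 2D list, coarse-resolution image at t1
    scale     : number of fine pixels per coarse pixel along each axis

    Assumption: the change observed in a coarse pixel applies uniformly
    to every fine pixel it covers.
    """
    predicted = []
    for r, fine_row in enumerate(fine_t0):
        out_row = []
        for c, value in enumerate(fine_row):
            # Locate the coarse pixel that covers this fine pixel.
            cr, cc = r // scale, c // scale
            delta = coarse_t1[cr][cc] - coarse_t0[cr][cc]
            out_row.append(value + delta)
        predicted.append(out_row)
    return predicted


def rmse(predicted, observed):
    """Root-mean-square error between paired estimates and measurements."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def mean_bias(predicted, observed):
    """Mean error; a positive value indicates overestimation."""
    diffs = [p - o for p, o in zip(predicted, observed)]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    # Toy data: a 2x2 fine image covered by a single coarse pixel.
    fine_t0 = [[10, 12], [11, 13]]
    coarse_t0 = [[11]]
    coarse_t1 = [[14]]
    print(fuse_temporal_change(fine_t0, coarse_t0, coarse_t1, scale=2))
```

Under these assumptions, a positive `mean_bias` between estimated and sampled TOC values would correspond to the kind of overestimation tendency the abstract reports for the genetic programming model.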
