Comparative data mining analysis for information retrieval of MODIS images: monitoring lake turbidity changes at Lake Okeechobee, Florida

In remote sensing, a recurring question is which computational intelligence or data mining algorithms are most suitable for retrieving essential information, given that most natural systems are highly non-linear. Candidates include empirical regression, neural network models, support vector machines, genetic algorithms and genetic programming, and analytical equations. This paper compares three data mining techniques (multiple non-linear regression, artificial neural networks, and genetic programming) for estimating multi-temporal turbidity changes following hurricane events at Lake Okeechobee, Florida. This retrospective analysis aims to identify how the major hurricanes affected water quality management in 2003-2004. Moderate Resolution Imaging Spectroradiometer (MODIS) Terra 8-day composite images were used to retrieve the spatial patterns of turbidity for comparison against the patterns discernible in the in-situ observations. Based on an evaluation of four statistical parameters, the genetic programming model was selected as the most suitable data mining tool for classification; the model identified the MODIS band 1 image and wind speed as the major determinants. The multi-temporal turbidity maps generated before and after the major hurricane events in 2003-2004 showed that turbidity levels were substantially higher after hurricane episodes. The spatial patterns of turbidity confirm that sediment-laden water travels to the shore, where it reduces the light intensity that submerged plants need for photosynthesis. This reduction results in a substantial loss of biomass during the post-hurricane period.
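
The model selection step described above can be illustrated with a minimal sketch. The abstract does not name the four statistical parameters, so the statistics used here (RMSE, mean bias, mean absolute error, and R²) are assumptions chosen for illustration, and the function names `fit_statistics` and `rank_models` are hypothetical. The sketch only shows how predictions from several candidate models might be scored against in-situ turbidity observations; it is not the paper's procedure.

```python
import math


def fit_statistics(observed, predicted):
    """Score one model's predictions against observations.

    The four statistics below are illustrative assumptions; the
    abstract does not say which four the study actually used.
    """
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("observed and predicted must be non-empty and of equal length")

    residuals = [p - o for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)

    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)

    return {
        "rmse": math.sqrt(ss_res / n),               # root mean square error
        "bias": sum(residuals) / n,                  # mean signed error
        "mae": sum(abs(r) for r in residuals) / n,   # mean absolute error
        # R^2 is undefined when the observations have no variance.
        "r2": 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
    }


def rank_models(observed, predictions_by_model):
    """Return (model name, statistics) pairs sorted by ascending RMSE."""
    scored = {
        name: fit_statistics(observed, predicted)
        for name, predicted in predictions_by_model.items()
    }
    return sorted(scored.items(), key=lambda item: item[1]["rmse"])


if __name__ == "__main__":
    # Made-up numbers for demonstration only; not data from the study.
    observed_turbidity = [10.0, 20.0, 30.0, 40.0]
    candidates = {
        "model_a": [12.0, 18.0, 33.0, 39.0],
        "model_b": [10.0, 20.0, 30.0, 44.0],
    }
    for name, stats in rank_models(observed_turbidity, candidates):
        print(name, {key: round(value, 3) for key, value in stats.items()})
```

Ranking by RMSE alone is a simplification. A comparison that weighs several statistics together, as the study appears to do, would need an explicit rule for combining them.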
