Classification of algal bloom species from remote sensing data using an extreme gradient boosted decision tree model

ABSTRACT Coastal and open ocean regions throughout the world are increasingly subject to toxic, harmful, or unusually intense algal blooms, whose incidence is rising over large geographical areas because of anthropogenic factors such as pollution and climate shifts. To date, the ability to detect the causative species from remote sensing data has been severely limited by the difficulty of interpreting the composite reflectance signal from different water features and types. In the present study, an accurate and reliable method is developed to automatically detect the onset of blooms and correctly classify the bloom species in Arabian Sea and Bay of Bengal waters using remote sensing data. A data-driven approach using a machine learning algorithm is devised based on reflectance spectral signatures and tested on several MODIS-Aqua (Moderate Resolution Imaging Spectroradiometer) datasets for classifying the dominant water categories: clear ocean waters devoid of sediments and algal blooms, sediment-laden coastal waters, and blooms of three major species, Trichodesmium erythraeum, Noctiluca scintillans, and Cochlodinium polykrikoides. An extreme gradient boosted decision tree (XGBoost) model is chosen because it improves prediction accuracy by limiting overfitting, which helps the model generalize to unseen test data. The model was trained using 1.5 million samples and achieved a classification accuracy of over 98%. When the results were validated using forty thousand random samples from known blooms, an overall accuracy of more than 96.8% was achieved. The applicability of the trained XGBoost model was further verified using MODIS-Aqua images, and it showed promise for the successful detection and identification of well-documented blooms. The use of spectral information to classify algal blooms makes this method more robust and easily adaptable to different ocean colour sensors, with scope to accommodate other major algal blooms.
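The classification idea in the abstract (boosted decision trees that map a per-pixel spectrum to one of five water categories) can be illustrated with a small sketch. This is not the authors' model and not XGBoost itself: it is plain first-order gradient boosting with depth-1 trees (stumps) in pure Python, without XGBoost's second-order updates or regularization. The data are synthetic, with arbitrary numbers standing in for four spectral bands rather than measured MODIS-Aqua reflectances, and the function names, number of rounds, and learning rate are assumptions made for illustration only.

```python
import math
import random

# The five water categories named in the abstract (class index = list position).
CLASSES = ["clear", "sediment", "trichodesmium", "noctiluca", "cochlodinium"]


def make_toy_data(n_per_class, seed):
    """Synthetic 4-band 'spectra' with arbitrary values, NOT real reflectances."""
    rng = random.Random(seed)
    X, y = [], []
    for k in range(len(CLASSES)):
        for _ in range(n_per_class):
            # Class 0 is low in every band; class k > 0 is high in band k - 1.
            X.append([(5.0 if j == k - 1 else 1.0) + rng.gauss(0.0, 0.3)
                      for j in range(4)])
            y.append(k)
    return X, y


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fit_stump(X, residuals):
    """Least-squares regression stump: (feature, threshold, left, right)."""
    n, total = len(X), sum(residuals)
    best = None
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for pos in range(1, n):
            left_sum += residuals[order[pos - 1]]
            lo, hi = X[order[pos - 1]][j], X[order[pos]][j]
            if lo == hi:
                continue  # cannot split between identical values
            left_mean = left_sum / pos
            right_mean = (total - left_sum) / (n - pos)
            # Maximizing this quantity minimizes the squared error of the split.
            gain = pos * left_mean ** 2 + (n - pos) * right_mean ** 2
            if best is None or gain > best[0]:
                best = (gain, j, (lo + hi) / 2.0, left_mean, right_mean)
    if best is None:  # every feature constant: fall back to a constant stump
        return (0, 0.0, total / n, total / n)
    return best[1:]


def stump_value(stump, row):
    j, threshold, left, right = stump
    return left if row[j] <= threshold else right


def train(X, y, n_classes, rounds=10, lr=0.3):
    """Each round fits one stump per class to the softmax residuals."""
    scores = [[0.0] * n_classes for _ in X]
    model = []
    for _ in range(rounds):
        probs = [softmax(s) for s in scores]
        stumps = []
        for k in range(n_classes):
            residuals = [(1.0 if y[i] == k else 0.0) - probs[i][k]
                         for i in range(len(X))]
            j, threshold, left, right = fit_stump(X, residuals)
            stump = (j, threshold, lr * left, lr * right)  # shrink leaf values
            stumps.append(stump)
            for i, row in enumerate(X):
                scores[i][k] += stump_value(stump, row)
        model.append(stumps)
    return model


def predict(model, row, n_classes):
    scores = [0.0] * n_classes
    for stumps in model:
        for k, stump in enumerate(stumps):
            scores[k] += stump_value(stump, row)
    return max(range(n_classes), key=lambda k: scores[k])


def accuracy(model, X, y, n_classes):
    hits = sum(predict(model, row, n_classes) == label for row, label in zip(X, y))
    return hits / len(X)


if __name__ == "__main__":
    X_train, y_train = make_toy_data(n_per_class=20, seed=0)
    X_test, y_test = make_toy_data(n_per_class=10, seed=1)
    model = train(X_train, y_train, len(CLASSES))
    print("held-out accuracy:", accuracy(model, X_test, y_test, len(CLASSES)))
```

The held-out accuracy printed here says nothing about real bloom data, because the toy classes are separable by construction. In practice one would replace the synthetic rows with labelled reflectance spectra and use a full XGBoost implementation, whose regularization terms are what the abstract credits for limiting overfitting.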
