Modeling Chlorophyll-A in Taihu Lake with Machine Learning Models

This paper studies the relation between chlorophyll-a and 10 environmental factors, namely water temperature (T), chemical oxygen demand (COD), NH₄⁺, NO₃⁻, total nitrogen (TN), PO₄³⁻, total phosphorus (TP), suspended solids (SS), Secchi depth (SD) and water depth (D), based on 2005 monitoring data from Taihu Lake. Three kinds of models are built using the multiple regression statistical (MRS) method, the back-propagation artificial neural network (BP ANN) and the support vector machine (SVM). Model validation shows that the two machine learning models, BP ANN and SVM, perform better than the linear MRS model, and that the SVM gives the best performance in terms of root mean square error. The sensitivity analysis indicates that the concentration of chlorophyll-a is very sensitive to changes in water temperature, water depth and total nitrogen, but does not change significantly with phosphorus variables such as total phosphorus and orthophosphate. This implies that algal blooms are more likely determined by physical parameters and accumulated in shallow areas by wind.

... controlled by limiting, physiological and multiple factors. (3) Time-series analysis models, based on multivariate relationships with limiting and multiple factors, predict time-dependent chlorophyll-a. (4) Heuristic models, based on cross-section data and time-series data, predict seasonality or serial dependency of phytoplankton composition by combining species assemblages with causal knowledge of limiting, physiological and multiple factors. (5) Fuzzy models quantify periodically (e.g., monthly) the possible dominance of algal species. (6) Neural network models, driven by time-series data of algal species and control factors, predict the timing and magnitudes of algal species based upon the strength of associations with limiting and multiple control factors.
In China, for Taihu Lake, Xu Qiujin established partial differential equations based on algal ecological dynamics (2), considering a number of factors such as water temperature, TN and TP; Chen Yonggen used a statistical regression method that considers only the relationships between chlorophyll-a, algal biomass and TN and TP concentrations (3); Chen Yuwei examined the relation between algal biomass and 15 environmental factors using the stepwise multiple regression statistical method (4); Yao Zhihong tried to forecast the growth of blue-green algae using a neural network (5), but that research focused on the structure of the neural network and did not explain the algal predictions. Furthermore, because designing and training neural networks is often a complicated, time-consuming task in which many parameters have to be tuned, the support vector machine (SVM) has been introduced as a promising alternative to neural networks and has produced good results in pattern recognition (6), classification (7-9) and prediction (10-12). However, little work has been reported on the application of support vector machines to the prediction of phytoplankton growth.
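The abstract compares models by root mean square error and reports a sensitivity analysis, without stating how the sensitivity analysis was carried out. The following is a minimal sketch of both ideas, assuming a simple one-at-a-time perturbation scheme (each input raised by 10% while the others stay fixed). The function names, the `toy_model` stand-in for a fitted model, and all numeric values are hypothetical and are not taken from the Taihu Lake data or from the paper's method.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def one_at_a_time_sensitivity(model, baseline, rel_change=0.10):
    """Relative change in model output when one input at a time is
    increased by rel_change, with all other inputs held at baseline.

    This perturbation scheme is an assumption made for illustration.
    """
    base_out = model(baseline)
    sensitivity = {}
    for name, value in baseline.items():
        perturbed = dict(baseline)
        perturbed[name] = value * (1.0 + rel_change)
        sensitivity[name] = (model(perturbed) - base_out) / base_out
    return sensitivity


def toy_model(x):
    """Hypothetical stand-in for a fitted chlorophyll-a model.

    The coefficients are invented for this example only.
    """
    return 0.8 * x["T"] + 2.0 * x["TN"] - 1.5 * x["D"] + 0.01 * x["TP"]


# Invented baseline conditions (not measured values).
baseline = {"T": 20.0, "TN": 3.0, "D": 2.0, "TP": 0.1}
sens = one_at_a_time_sensitivity(toy_model, baseline)

# Rank inputs by the magnitude of their effect on the output.
ranking = sorted(sens, key=lambda name: abs(sens[name]), reverse=True)
```

In practice `toy_model` would be replaced by the prediction function of whichever fitted model (MRS, BP ANN or SVM) is being examined, and `rmse` would be evaluated on held-out validation samples for each model. In this toy setup the ranking puts T first and TP last purely because of the invented coefficients.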

[1] Qin Boqiang, et al. Prediction of Blue-green Algae Bloom Using Stepwise Multiple Regression Between Algae & Related Environmental Factors in Meiliang Bay, Lake Taihu, 2001.

[2] Juan Carlos Gutiérrez-Estrada, et al. Artificial neural network approaches to one-step weekly prediction of Dinophysis acuminata blooms in Huelva (Western Andalucía, Spain), 2007.

[3] Yue-Shi Lee, et al. Robust and efficient multiclass SVM models for phrase pattern recognition, 2008, Pattern Recognit.

[4] Maziar Palhang, et al. Generalization performance of support vector machines and neural networks in runoff modeling, 2009, Expert Syst. Appl.

[5] F. Recknagel, et al. Artificial neural network approach for modelling and prediction of algal blooms, 1997.

[6] Cheong Hee Park, et al. A SVM-based discretization method with application to associative classification, 2009, Expert Syst. Appl.

[7] Li Sheng-peng. Research on prediction of phytoplankton's density using support vector machines, 2007.

[8] S. Soyupak, et al. Case studies on the use of neural networks in eutrophication modeling, 2000.

[9] P. G. Whitehead, et al. Modelling algal growth and transport in rivers: a comparison of time series analysis, dynamic mass balance and neural network techniques, 1997, Hydrobiologia.

[10] Qiang Fu, et al. Using support vector machine to predict eco-environment burden: a case study of Wuhan, Hubei Province, China, 2008, Biomedical and Environmental Sciences: BES.

[11] Chen Yuwei, et al. Prediction of Blue-green Algae Bloom Using Stepwise Multiple Regression Between Algae & Related Environmental Factors in Meiliang Bay, Lake Taihu, 2001.

[12] Holger R. Maier, et al. Use of artificial neural networks for modelling cyanobacteria Anabaena spp. in the River Murray, South Australia, 1998.

[13] M. Dokulil, et al. Changes of nutrients and phytoplankton chlorophyll-a in a large shallow lake, Taihu, China: an 8-year investigation, 2003, Hydrobiologia.

[14] Gustavo Camps-Valls, et al. Retrieval of oceanic chlorophyll concentration with relevance vector machines, 2006.

[15] Lorenzo Bruzzone, et al. Classification of hyperspectral remote-sensing data with primal SVM for small-sized training dataset problem, 2008.

[16] Yan Huang, et al. Neural network modelling of coastal algal blooms, 2003.

[17] S. Durbha, et al. Support vector machines regression for retrieval of leaf area index from multiangle imaging spectroradiometer, 2007.

[18] Björn A. Malmgren, et al. Application of Artificial Neural Networks (ANN) to Primary Production Time-series Data, 2001.

[19] Zhou Liguo, et al. Relationship between blue algal bloom and water temperature in Lake Taihu based on MODIS, 2008.

[20] M. Hosomi, et al. Novel application of a back-propagation artificial neural network model formulated to predict algal bloom, 1997.

[21] Hu Wei-ping. Relationships between chlorophyll-a content and TN and TP concentrations in water bodies of Taihu Lake, China, 2007.

[22] Du Pei. Hyperspectral Remote Sensing Image Classification Based on Support Vector Machine, 2008.

[23] Yao Zhi-hong. Improved Genetic Neural Network and Its Application in Forecasting of Rich Nourishment of Water and Blue-Green Algae, 2008.

[24] Xu Qiujin. Ecological Simulation of Algae Growth in Taihu Lake, 2001.