A comparative study of multiple linear regression, artificial neural network and support vector machine for the prediction of dissolved oxygen

Dissolved oxygen (DO) is an important indicator of the health of aquatic ecosystems. The balance between oxygen supply and consumption in a water body is significantly influenced by physical and chemical parameters. This study aimed to evaluate and compare the performance of multiple linear regression (MLR), back propagation neural network (BPNN), and support vector machine (SVM) models for predicting DO concentration from multiple water quality parameters. The data set comprised 969 samples collected from rivers in China. The 16 predictor variables covered physical factors, nutrients, organic substances, and metal ions, which affect DO concentrations directly or indirectly by influencing water–air exchange, the growth of aquatic plants, and the lives of aquatic animals. The models, optimized with the particle swarm optimization (PSO) algorithm, were calibrated and tested on nearly 80% and 20% of the data, respectively. The results showed that PSO-BPNN and PSO-SVM had better predictive performance than MLR. All of the evaluation criteria, including the coefficient of determination, mean squared error, and absolute relative error, suggested that the PSO-SVM model was superior to the MLR and PSO-BPNN models for DO prediction in the rivers of China when little other information is available.
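The evaluation protocol described above (a calibration/test split of roughly 80%/20% and three error criteria) can be illustrated with a minimal sketch. The function names, the random shuffling, the seed, and the exact metric definitions below are assumptions for illustration, not the procedure used in the study; in particular, the coefficient of determination is computed here as 1 - SS_res/SS_tot, and the absolute relative error is averaged over samples.

```python
import random


def split_calibration_test(samples, test_fraction=0.2, seed=0):
    """Shuffle samples and split them into calibration and test subsets.

    The shuffle and the seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def mean_squared_error(observed, predicted):
    """Average of squared differences between observed and predicted DO."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def coefficient_of_determination(observed, predicted):
    """R^2 defined as 1 - SS_res / SS_tot (one common definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_absolute_relative_error(observed, predicted):
    """Mean of |observed - predicted| / |observed|; observed must be nonzero."""
    return sum(abs(o - p) / abs(o) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical DO values in mg/L, used only to exercise the functions.
observed = [8.0, 6.0, 10.0, 4.0]
predicted = [7.0, 6.0, 9.0, 5.0]

calibration, test = split_calibration_test(range(969))
mse = mean_squared_error(observed, predicted)
r2 = coefficient_of_determination(observed, predicted)
are = mean_absolute_relative_error(observed, predicted)
```

Applying the same three functions to the test-set predictions of each model gives a like-for-like comparison: a lower mean squared error and absolute relative error, together with a higher coefficient of determination, indicate a better fit.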
