Combining statistical machine learning models with ARIMA for water level forecasting: The case of the Red river

Abstract Forecasting water levels is an extremely important task because it helps mitigate the effects of floods and reduce or prevent disasters. Physically based models often give good results, but they require long computation times and various types of hydro-geomorphological data to build the forecasting system. Data-driven forecasting models, by contrast, are usually faster and easier to build. Over the past decades, statistical machine learning (ML) methods have greatly contributed to the advancement of data-driven forecasting systems that provide cost-effective solutions and better performance. Meanwhile, the autoregressive integrated moving average (ARIMA) model is one of the best-known linear statistical models for time series forecasting. In this paper, we propose a hybrid approach that takes advantage of both linear and nonlinear models. The proposed method combines statistical machine learning algorithms with ARIMA to forecast water levels. Observed water levels of the Red River at the Vu Quang and Hanoi hydrological stations (sampled every 3 hours from 2008 to 2017) and at the Hung Yen hydrological station (collected hourly from 2008 to April 2015) are used to evaluate the performance of the different methods. Experimental results on these three large real-world datasets show the effectiveness of the proposed hybrid models.
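The abstract does not say how the linear and nonlinear components are combined, so the following is only an illustrative sketch of one common residual-based scheme: a linear model captures the main dynamics, a nonlinear learner is trained on what the linear model leaves unexplained, and the two forecasts are summed. To stay dependency-free, a least-squares AR(1) fit stands in for ARIMA and a one-dimensional k-nearest-neighbour regressor stands in for the ML model. Both substitutions, all function names, and the sample numbers are assumptions made for illustration, not the authors' implementation or data.

```python
from statistics import mean


def fit_ar1(series):
    """Least-squares fit of y[t] = c + phi * y[t-1]; returns (c, phi).

    Stand-in for the linear (ARIMA) component; a real ARIMA fit would also
    handle differencing and moving-average terms.
    """
    x, y = series[:-1], series[1:]
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0:  # constant series: fall back to predicting the mean
        return my, 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    phi = sxy / sxx
    return my - phi * mx, phi


def knn_predict(train_x, train_y, query, k=3):
    """Average the targets of the k training inputs closest to `query`."""
    ranked = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - query))
    return mean(target for _, target in ranked[:k])


def hybrid_forecast(series, k=3):
    """One-step-ahead forecast = linear part + nonlinear residual correction.

    Assumes `series` has at least three points so that at least one
    (previous residual, next residual) training pair exists.
    """
    c, phi = fit_ar1(series)
    # In-sample residuals of the linear model: e[t] = y[t] - (c + phi * y[t-1])
    residuals = [series[t] - (c + phi * series[t - 1])
                 for t in range(1, len(series))]
    # Nonlinear component: learn the next residual from the previous one.
    residual_part = knn_predict(residuals[:-1], residuals[1:], residuals[-1], k)
    linear_part = c + phi * series[-1]
    return linear_part, residual_part, linear_part + residual_part


if __name__ == "__main__":
    # Made-up water levels in metres, for demonstration only.
    levels = [2.10, 2.15, 2.30, 2.60, 2.95, 3.10, 3.05, 2.90, 2.70, 2.55]
    linear, correction, total = hybrid_forecast(levels, k=2)
    print(f"linear={linear:.3f} correction={correction:.3f} hybrid={total:.3f}")
```

In such a scheme the correction term only helps when the linear residuals still contain learnable structure; if they are close to white noise, the hybrid forecast reduces to the linear one.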
