Missing data simulation inside flow rate time-series using multiple-point statistics

The direct sampling (DS) multiple-point statistical technique is proposed as a non-parametric missing data simulator for hydrological flow rate time-series. The algorithm uses the patterns contained in a training data set to reproduce the complexity of the missing data. The proposed setup is tested on the reconstruction of a flow rate time-series under several missing data scenarios and is compared against an ARMAX time-series model. The results show that DS generates more realistic simulations than ARMAX and better recovers the statistical content of the missing data. The predictive power of both techniques increases substantially when a correlated flow rate time-series is used as auxiliary information, but DS can also use an incomplete auxiliary time-series with comparable predictive power. This makes the technique a practical simulation tool for practitioners dealing with incomplete data sets.

A resampling technique is applied to missing flow rate data simulation. The proposed technique generates realistic temporal data patterns. Generally, the statistical content is entirely recovered even in large gaps. The setup can use an auxiliary time-series to condition the simulation. An incomplete auxiliary time-series can be used, with increased prediction power.
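The pattern-resampling idea described in the abstract can be illustrated with a heavily simplified, univariate sketch. This is not the authors' implementation: the function name `ds_fill_gaps` and its parameters `n_neighbors`, `threshold`, and `max_scan_fraction` are hypothetical names chosen for illustration, the observed part of the series itself is assumed to serve as the training data set, and no auxiliary time-series is used. For each missing value, the sketch takes the closest informed values as a conditioning pattern, scans random observed positions for a similar pattern, and copies the value found there.

```python
import math
import random


def ds_fill_gaps(series, n_neighbors=4, threshold=0.05,
                 max_scan_fraction=0.5, seed=0):
    """Fill None entries of `series` by resampling observed values.

    Simplified illustration only; all names and defaults are assumptions.
    """
    rng = random.Random(seed)
    sim = list(series)
    known = [v for v in series if v is not None]
    if not known:
        raise ValueError("series contains no observed values")
    # Normalise distances by the observed range (1.0 if the range is zero).
    span = (max(known) - min(known)) or 1.0

    # Training positions: originally observed values only.
    candidates = [i for i, v in enumerate(series) if v is not None]
    gaps = [i for i, v in enumerate(series) if v is None]
    rng.shuffle(gaps)  # random simulation path through the missing values
    n_scan = min(len(candidates),
                 max(1, int(max_scan_fraction * len(candidates))))

    for t in gaps:
        # Conditioning pattern: the closest informed points around t,
        # including values simulated earlier on the path.
        informed = sorted((i for i, v in enumerate(sim) if v is not None),
                          key=lambda i: abs(i - t))[:n_neighbors]
        pattern = [(i - t, sim[i]) for i in informed]

        best_val, best_dist = None, math.inf
        for c in rng.sample(candidates, n_scan):
            total, count = 0.0, 0
            for lag, val in pattern:
                j = c + lag
                if 0 <= j < len(series) and series[j] is not None:
                    total += abs(series[j] - val) / span
                    count += 1
            if count == 0:
                continue  # no comparable points at this training position
            dist = total / count
            if dist < best_dist:
                best_val, best_dist = series[c], dist
            if dist <= threshold:
                break  # accept the first sufficiently similar pattern

        # Fall back to a random observed value if nothing was comparable.
        sim[t] = best_val if best_val is not None else rng.choice(known)
    return sim


# Usage: a synthetic periodic "flow rate" series with a 20-step gap.
flow = [10 + 5 * math.sin(2 * math.pi * k / 20) for k in range(200)]
gappy = list(flow)
for k in range(90, 110):
    gappy[k] = None
filled = ds_fill_gaps(gappy)
```

Because every simulated value is copied from the observed record, the filled series can only contain values that already occur in the training data; changing `seed` gives a different realization, which is how an ensemble of simulations could be produced with this sketch.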
