A machine learning based methodology for anomaly detection in dam behaviour

Dam behaviour is difficult to predict with high accuracy. Numerical models for structural calculation solve the equations of continuum mechanics, but are subject to considerable uncertainty in the characterisation of materials, especially with regard to the foundation. As a result, these models are often incapable of calculating dam behaviour with sufficient precision, and it is difficult to determine whether a given deviation between model results and monitoring data represents a relevant anomaly or incipient failure. At the same time, dam monitoring devices are increasingly automated, which allows a higher reading frequency and yields a greater amount and variety of available data, such as displacements, leakage, or interstitial pressure, among others. This increasing volume of dam monitoring data motivates studying the ability of advanced tools to extract useful information from observed variables. In particular, the field of Machine Learning (ML) has produced powerful algorithms for problems where the amount of data is much larger or the underlying phenomena are much less understood. In this thesis, the possibilities of machine learning techniques were analysed for application to dam structural analysis based on monitoring data. The typical characteristics of the data sets available in dam safety were taken into account, as regards their nature, quality and size. A critical literature review was performed, from which the key issues to consider when implementing these algorithms in dam safety were identified. A comparative study of the accuracy of a set of algorithms for predicting dam behaviour was carried out, considering radial and tangential displacements and leakage flow in a 100 m high dam. The results suggested that the algorithm called “Boosted Regression Trees” (BRT) is the most suitable, being more accurate in general, while also flexible and relatively easy to implement.
At a later stage, the interpretability of BRT models was evaluated, in order to identify the shape and intensity of the association between external variables and the dam response, as well as the effect of time. The interpretation tools were applied to the same test case, and allowed a more accurate identification of the time effect than the traditional statistical method. Finally, a methodology for early detection of anomalies using BRT-based predictive models was developed and implemented in an interactive tool that provides information on dam behaviour through a set of selected devices. The tool allows the user to easily verify whether the actual data for each of these devices are within a pre-defined normal operation interval.
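The check against a normal operation interval can be illustrated with a minimal sketch. The abstract does not state how the interval is defined, so the rule used here is an assumption: the interval is centred on the model prediction and its half-width is k times the standard deviation of the residuals observed during training. The function names, the value k = 2 and all numbers are hypothetical, and the predictions are taken as given rather than computed by a BRT model.

```python
from statistics import pstdev


def normal_interval(predicted, train_residuals, k=2.0):
    """Return (lower, upper) bounds around one model prediction.

    Assumption: half-width = k * standard deviation of training residuals.
    """
    spread = k * pstdev(train_residuals)
    return predicted - spread, predicted + spread


def flag_anomalies(observed, predicted, train_residuals, k=2.0):
    """Return one boolean per reading: True if it falls outside the interval."""
    flags = []
    for obs, pred in zip(observed, predicted):
        lower, upper = normal_interval(pred, train_residuals, k)
        flags.append(not (lower <= obs <= upper))
    return flags


# Hypothetical data for a single device (e.g. a radial displacement in mm).
train_residuals = [0.2, -0.1, 0.0, 0.1, -0.2]  # observed - predicted, training period
predicted = [10.0, 10.5, 11.0]                 # model output for new readings
observed = [10.1, 10.4, 12.0]                  # actual new readings

flags = flag_anomalies(observed, predicted, train_residuals)
print(flags)
```

With these illustrative numbers, only the third reading lies outside its interval and would be reported as a potential anomaly. In practice the interval width would need to be calibrated for each device.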
