Statistical model optimized random forest regression model for concrete dam deformation monitoring

The unique structure and foundation of each dam make safety monitoring a complex task. Deformation is the most intuitive response of a dam and carries important information about how the dam evolves. Comparing the measured response with model predictions serves diagnosis and early warning. Because the conventional statistical model generalizes poorly, a dam deformation monitoring model with better generalization is essential. This study investigates the prediction of concrete dam deformation by combining a statistical model with a random forest regression (RFR) model. To build the optimized RFR model, the statistical model is used to establish the input variables, the parameters Mtry and Ntree are selected according to the out-of-bag error, and strongly explanatory variables are extracted. The advantage of the model is that its influence factors describe concrete dam deformation, while random forest (RF) serves as a sensible new data mining tool. RF measures the importance of each variable for deformation prediction, and the RFR method extracts representative influencing factors based on that variable importance. The methods are applied to an actual concrete dam. The results indicate that the RFR model can also be applied to the analysis and prediction of other structural behavior.
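As an illustration of how a statistical model might supply the input variables for the RFR model, the sketch below builds one row of candidate factors. The abstract does not list the factors, so the hydrostatic, seasonal, and time terms used here follow a commonly used hydrostatic-seasonal-time form and are an assumption, as are the function name `hst_features`, its arguments, and the 100-day time scaling.

```python
import math


def hst_features(water_level, day_of_year, days_since_start):
    """Return one row of candidate input variables for a deformation model.

    Hypothetical helper: the factor set is an assumed
    hydrostatic-seasonal-time form, not the one used in the paper.
    """
    if days_since_start <= 0:
        raise ValueError("days_since_start must be positive for ln(theta)")

    h = water_level
    s = 2.0 * math.pi * day_of_year / 365.0  # annual phase angle
    theta = days_since_start / 100.0         # scaled elapsed time (assumed scaling)

    return {
        # hydrostatic component: powers of the water level
        "H": h, "H2": h ** 2, "H3": h ** 3, "H4": h ** 4,
        # seasonal (thermal) component: annual and semi-annual harmonics
        "sin_s": math.sin(s), "cos_s": math.cos(s),
        "sin_2s": math.sin(2 * s), "cos_2s": math.cos(2 * s),
        # time-dependent (irreversible) component
        "theta": theta, "ln_theta": math.log(theta),
    }


# Example: one observation with made-up values
row = hst_features(water_level=120.0, day_of_year=91, days_since_start=250)
```

Rows built this way could be stacked into the input matrix of a random forest regressor. In scikit-learn's `RandomForestRegressor`, for example, Mtry and Ntree would correspond to `max_features` and `n_estimators`, and `oob_score=True` makes the out-of-bag estimate available for comparing candidate settings.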
