Stacking ensemble with parsimonious base models to improve generalization capability in the characterization of steel bolted components

Abstract This study presents a new soft computing method for creating an accurate and reliable model that determines three key points of the comprehensive force–displacement curve of bolted components in steel structures. To this end, a database containing the results of a set of finite element (FE) simulations, which represent real responses of bolted components, is used to create a stacking ensemble model that combines the predictions of different parsimonious base models. The innovative proposal of this study is the use of GA-PARSIMONY, a previously published GA-based method that searches for parsimonious models by jointly optimizing feature selection and hyperparameter tuning. Parsimonious solutions created with a variety of machine learning methods are then combined, by means of a nested cross-validation scheme, in a single meta-learner in order to increase diversity and minimize the generalization error. The results reveal that efficiently combining parsimonious models provides more accurate and reliable predictions than the other methods considered. Thus, the informational model is able to replace costly FE simulations without significantly compromising accuracy and could be implemented in structural analysis software.
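The stacking idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single input feature, synthetic data, two hypothetical base learners (`fit_linear` and `fit_knn`), a simple convex-weight meta-learner, and one level of cross-validation rather than the nested scheme; GA-PARSIMONY is not reproduced. All function names are illustrative.

```python
import random
import statistics


def fit_linear(xs, ys):
    """Ordinary least squares for y = a + b*x; returns a predictor."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    a = my - b * mx
    return lambda x: a + b * x


def fit_knn(xs, ys, k=3):
    """k-nearest-neighbour regressor on one feature; returns a predictor."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return statistics.fmean([y for _, y in nearest])

    return predict


def kfold_indices(n, k):
    """Split range(n) into k interleaved folds."""
    idx = list(range(n))
    return [idx[i::k] for i in range(k)]


def out_of_fold(fit, xs, ys, k=5):
    """Predict each sample with a model that never saw it during fitting."""
    oof = [0.0] * len(xs)
    for test_idx in kfold_indices(len(xs), k):
        held = set(test_idx)
        tr_x = [x for i, x in enumerate(xs) if i not in held]
        tr_y = [y for i, y in enumerate(ys) if i not in held]
        model = fit(tr_x, tr_y)
        for i in test_idx:
            oof[i] = model(xs[i])
    return oof


def fit_stack(xs, ys, base_fits, k=5):
    """Meta-learner: convex weight w fitted on out-of-fold predictions.

    Assumes exactly two base learners (an assumption of this sketch).
    """
    p1, p2 = (out_of_fold(f, xs, ys, k) for f in base_fits)
    num = sum((a - b) * (y - b) for a, b, y in zip(p1, p2, ys))
    den = sum((a - b) ** 2 for a, b in zip(p1, p2))
    w = min(1.0, max(0.0, num / den)) if den else 0.5
    # Refit both base learners on all data for final predictions.
    m1, m2 = (f(xs, ys) for f in base_fits)
    return (lambda x: w * m1(x) + (1 - w) * m2(x)), w


# Synthetic data standing in for FE simulation results (assumption).
rng = random.Random(0)
xs = [i / 10 for i in range(60)]
ys = [0.5 * x * x + rng.gauss(0.0, 0.2) for x in xs]

stacked, weight = fit_stack(xs, ys, (fit_linear, fit_knn))
print("weight on linear model:", weight)
print("stacked prediction at x = 2.0:", stacked(2.0))
```

Fitting the meta-learner on out-of-fold predictions, rather than on in-sample fits, keeps it from rewarding a base model that merely memorizes the training data, which is the motivation for the cross-validated combination described in the abstract.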
