SAFETY RISK EVALUATIONS OF DEEP FOUNDATION CONSTRUCTION SCHEMES BASED ON IMBALANCED DATA SETS

Safety risk evaluation of deep foundation construction schemes is essential to construction safety. However, such evaluations draw on a large body of knowledge, and the historical data of deep foundation engineering are imbalanced. Both factors degrade the quality and efficiency of evaluations that rely on traditional manual tools. Machine learning can improve the quality of imbalanced data classification. In this study, three strategies are proposed to improve the classification accuracy on imbalanced data sets. First, information redundancy in the data set is reduced using a binary particle swarm optimization algorithm. Second, the classification algorithm is strengthened by using a support vector machine classifier enhanced with AdaBoost. Finally, a different classification evaluation standard, namely the area under the ROC curve (AUC), is adopted so that the classifier is not biased against the minority class. A comparative experiment across multiple classification algorithms shows that the proposed integrated classification algorithm can overcome the difficulty of correctly classifying minority samples in imbalanced data sets. The algorithm can also improve construction safety management evaluations, relieve the pressure caused by the shortage of experienced experts during rapid infrastructure construction, and facilitate knowledge reuse in the field of architecture, engineering, and construction.
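The reason for evaluating with the area under the ROC curve rather than plain accuracy can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `accuracy` and `roc_auc`, the toy labels, and the scores are all assumptions made for illustration. AUC is computed here through its rank interpretation, the probability that a randomly chosen minority sample receives a higher score than a randomly chosen majority sample.

```python
def accuracy(labels, predictions):
    """Fraction of samples whose predicted class matches the true class."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts, over all (positive, negative) pairs, how often the positive
    sample has the higher score; ties count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical imbalanced toy set: 18 majority samples (0), 2 minority (1).
labels = [0] * 18 + [1] * 2

# Classifier A ignores the minority class entirely.
constant_scores = [0.0] * 20
constant_predictions = [0] * 20

# Classifier B ranks the minority samples highly but makes two
# threshold errors at a cut-off of 0.5.
informative_scores = [0.1] * 17 + [0.6] + [0.4, 0.9]
informative_predictions = [1 if s >= 0.5 else 0 for s in informative_scores]

acc_a = accuracy(labels, constant_predictions)
acc_b = accuracy(labels, informative_predictions)
auc_a = roc_auc(labels, constant_scores)
auc_b = roc_auc(labels, informative_scores)

print(f"A: accuracy={acc_a:.3f}, AUC={auc_a:.3f}")
print(f"B: accuracy={acc_b:.3f}, AUC={auc_b:.3f}")
```

On this toy set both classifiers reach the same accuracy, yet only the second separates the two classes at all, and AUC is the measure that exposes the difference.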
