AI Predicts Independent Construction Safety Outcomes from Universal Attributes

This paper significantly improves on, and completes the validation of, the approach proposed in "Application of Machine Learning to Construction Injury Prediction" (Tixier et al. 2016 [1]). As in the original study, we use NLP to extract fundamental attributes from raw incident reports and train machine learning models to predict safety outcomes (here, injury severity, injury type, body part impacted, and incident type). However, in this study, safety outcomes were not extracted via NLP but come from independent human annotations, eliminating any potential source of artificial correlation between predictors and predictands. Results show that attributes are still highly predictive, confirming the validity of the original study. Other improvements brought by the current study include (1) a much larger dataset, (2) two new models (XGBoost and linear SVM), (3) model stacking, (4) a more straightforward experimental setup with more appropriate performance metrics, and (5) an analysis of per-category attribute importance scores. Finally, the injury severity outcome is well predicted, which was not the case in the original study. This is a significant advancement.
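
To make the setup described above concrete, the sketch below shows one way a raw report could be turned into a binary attribute vector, which is then paired with an outcome label assigned independently by a human. It is a minimal illustration, not the NLP tool used in the study: the attribute names, keyword lists, example report, and label are all hypothetical, and plain keyword matching stands in for the actual extraction rules.

```python
import re

# Hypothetical attribute lexicon: attribute name -> trigger keywords.
# These names and keywords are illustrative assumptions, not the study's.
ATTRIBUTE_KEYWORDS = {
    "ladder": ["ladder"],
    "hammer": ["hammer"],
    "slippery_surface": ["slippery", "wet floor", "ice"],
    "working_at_height": ["scaffold", "roof", "ladder"],
}

# Fixed column order so every report maps to the same feature layout.
ATTRIBUTES = sorted(ATTRIBUTE_KEYWORDS)


def extract_attributes(report):
    """Return a 0/1 list marking which attributes appear in the report."""
    text = report.lower()
    vector = []
    for name in ATTRIBUTES:
        # Whole-word match, so "ice" does not fire on "notice".
        found = any(
            re.search(r"\b" + re.escape(keyword) + r"\b", text)
            for keyword in ATTRIBUTE_KEYWORDS[name]
        )
        vector.append(int(found))
    return vector


# Predictors come from the report text only.
report = "Worker slipped on wet floor while carrying a ladder."
features = extract_attributes(report)

# The outcome is a separate human annotation (hypothetical value here),
# never derived from the text, so predictors and label stay independent.
severity_label = "first aid"
```

In this sketch, `features` and `severity_label` would form one training example; a classifier such as those named in the abstract would be fitted on many such pairs.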

[1]  Radford M. Neal.  Pattern Recognition and Machine Learning , 2007, Technometrics.

[2]  Matthew R. Hallowell,et al.  Automated content analysis for construction safety: A natural language processing system to extract precursors and outcomes from unstructured injury reports , 2016 .

[3]  Hans C. van Houwelingen,et al.  The Elements of Statistical Learning, Data Mining, Inference, and Prediction. Trevor Hastie, Robert Tibshirani and Jerome Friedman, Springer, New York, 2001. No. of pages: xvi+533. ISBN 0‐387‐95284‐5 , 2004 .

[4]  Chih-Jen Lin,et al.  Probability Estimates for Multi-class Classification by Pairwise Coupling , 2003, J. Mach. Learn. Res.

[5]  G. G. Stokes "J." , 1890, The New Yale Book of Quotations.

[6]  Matthew R. Hallowell,et al.  Attribute-Based Safety Risk Assessment. II: Predicting Safety Outcomes Using Generalized Linear Models , 2015 .

[7]  Jhareswar Maiti,et al.  An optimization-based decision tree approach for predicting slip-trip-fall accidents at work , 2019, Safety Science.

[8]  Marc Prades Villanova,et al.  Attribute-based Risk Model for Assessing Risk to Industrial Construction Tasks , 2014 .

[9]  Tomasz Arciszewski,et al.  Constructability Analysis: Machine Learning Approach , 1997 .

[10]  Matthew R. Hallowell,et al.  Methods of safety prediction: analysis and integration of risk assessment, leading indicators, precursor analysis, and safety climate , 2020, Construction Management and Economics.

[11]  William J. Wiatrowski,et al.  Comparing Fatal Work Injuries in the United States and the European Union , 2014 .

[12]  Ekambaram Palaneeswaran,et al.  A support vector machine model for contractor prequalification , 2009 .

[13]  Leo Breiman,et al.  Bagging Predictors , 1996, Machine Learning.

[14]  Lucio Soibelman,et al.  Data Preparation Process for Construction Knowledge Generation through Knowledge Discovery in Databases , 2002 .

[15]  Wei-Yin Loh,et al.  Classification and regression trees , 2011, WIREs Data Mining Knowl. Discov..

[16]  Enno Koehn,et al.  OSHA Regulations Effects on Construction , 1983 .

[17]  J. Weston,et al.  Support Vector Machine Solvers , 2007 .

[18]  Ashutosh Kumar Singh,et al.  The Elements of Statistical Learning: Data Mining, Inference, and Prediction , 2010 .

[19]  Yoav Freund,et al.  A decision-theoretic generalization of on-line learning and an application to boosting , 1995, EuroCOLT.

[20]  Bernhard E. Boser,et al.  A training algorithm for optimal margin classifiers , 1992, COLT '92.

[21]  Matthew R. Hallowell,et al.  Empirical measurement and improvement of hazard recognition skill , 2017 .

[22]  Chih-Jen Lin,et al.  LIBSVM: A library for support vector machines , 2011, TIST.

[23]  Matthew R. Hallowell,et al.  Automatically Learning Construction Injury Precursors from Text , 2019, Automation in Construction.

[24]  John A. Gambatese,et al.  Energy-based safety risk assessment: does magnitude and intensity of energy predict injury severity? , 2017 .

[25]  Chih-Jen Lin,et al.  LIBLINEAR: A Library for Large Linear Classification , 2008, J. Mach. Learn. Res.

[26]  Zhipeng Zhou,et al.  Overview and analysis of safety management studies in the construction industry , 2015 .

[27]  R.K.B. Hubbard,et al.  Major and minor accidents at the Thames barrier construction site , 1985 .

[28]  Amir H. Behzadan,et al.  Construction Productivity and Ergonomic Assessment Using Mobile Sensors and Machine Learning , 2017 .

[29]  H. Son,et al.  Automated Color Model-Based Concrete Detection in Construction-Site Images by Using Machine Learning Algorithms , 2012, J. Comput. Civ. Eng..

[30]  Matthew R. Hallowell,et al.  Application of machine learning to construction injury prediction , 2016 .

[31]  Tarek Hegazy,et al.  Neural networks as tools in construction , 1991 .

[32]  Matthieu Desvignes,et al.  Requisite empirical risk data for integration of safety with advanced technologies and intelligent systems , 2014 .

[33]  Tianqi Chen,et al.  XGBoost: A Scalable Tree Boosting System , 2016, KDD.

[34]  Leo Breiman,et al.  Random Forests , 2001, Machine Learning.

[35]  Hanguk Ryu,et al.  Predicting types of occupational accidents at construction sites in Korea using random forest model , 2019 .

[36]  L. Breiman.  Out-of-Bag Estimation , 1996 .

[37]  Matthew R. Hallowell,et al.  Construction Safety Clash Detection: Identifying Safety Incompatibilities among Fundamental Attributes using Data Mining , 2017 .

[38]  Amir H. Behzadan,et al.  Construction equipment activity recognition for simulation input modeling using mobile sensors and machine learning classifiers , 2015, Adv. Eng. Informatics.

[39]  Yang Miang Goh,et al.  Safety leading indicators for construction sites: A machine learning approach , 2018, Automation in Construction.

[40]  Jhareswar Maiti,et al.  Application of optimized machine learning techniques for prediction of occupational accidents , 2019, Comput. Oper. Res..

[41]  Min-Yuan Cheng,et al.  Estimate at Completion for construction projects using Evolutionary Support Vector Machine Inference Model , 2010 .

[42]  Simo Salminen.  Serious occupational accidents in the construction industry , 1995 .

[43]  Antoine J.-P. Tixier,et al.  Notes on Deep Learning for NLP , 2018, ArXiv.