Machine learning for analysis of occupational accidents registration data

Despite the efforts of employers and public organizations to eliminate occupational accidents, such accidents remain a persistent problem in the construction industry. In the Swedish construction context, there is a desire to identify the causes and factors that matter for the prevention of work-related accidents, because large, underused databases of accident registrations hold knowledge about the causes and context of accidents. The aim of this contribution is to review the application of machine learning (ML) to the improved prevention of accidents and the corresponding injuries, to identify current limitations, and, most importantly, to answer the question of whether ML actually reveals more than what is already known about accidents in construction. A systematic literature review on the use of ML for analysing accident record data was carried out. In the reviewed literature, ML was applied to predict accidents or their outcomes, and to extract or identify the causes affecting the risk of injury. ML combined with data mining (DM) techniques, such as Natural Language Processing and graph mining, appears to be beneficial for discovering associations between different features and across multiple levels of clusters. However, the literature shows that research on ML in accident prevention is at an early stage. The review indicates gaps in the justification of methodological choices, such as the choice of ML method and of data processing. Moreover, the characteristics of injury rates and severity are shown to clash with the mechanisms of ML classification algorithms. This should probably lead to abandoning severity as a parameter and to changing the approach to the asymmetric data classes (denoted "unbalanced" in ML methodology), leaving space for finding the important causes. An overreliance on internal validity testing persists, together with a lack of external testing of the algorithms' performance and prediction accuracy. Future research needs to focus on methods that address the problem of data processing, on explaining the choice of methods, on explaining the results (especially the variance in the performance of ML algorithms), on merging different data sources, on considering more attributes (such as risk management), on applying deep learning algorithms, and on improving the testing accuracy of ML models.
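The clash between unbalanced severity classes and classification accuracy can be illustrated with a minimal sketch. It is not drawn from any reviewed study: the label counts (95 "minor", 5 "severe") and the helper names `majority_baseline`, `accuracy` and `recall` are hypothetical and chosen only for illustration. A classifier that always predicts the majority class scores high accuracy while detecting none of the rare, severe cases.

```python
from collections import Counter


def majority_baseline(labels):
    """Return the most frequent class label (hypothetical helper)."""
    return Counter(labels).most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of records whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive):
    """Fraction of true `positive` records that were predicted as `positive`."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    actual_pos = sum(1 for t in y_true if t == positive)
    return true_pos / actual_pos if actual_pos else 0.0


# Made-up, illustrative labels: an assumption, not data from the review.
y_true = ["minor"] * 95 + ["severe"] * 5

# A trivial "model" that always predicts the majority class.
predicted_class = majority_baseline(y_true)
y_pred = [predicted_class] * len(y_true)

acc = accuracy(y_true, y_pred)
severe_recall = recall(y_true, y_pred, "severe")

print(f"accuracy = {acc:.2f}, recall on 'severe' = {severe_recall:.2f}")
```

In this toy setting the accuracy is 0.95 although no severe accident is identified, which is why per-class measures such as recall are more informative than overall accuracy on unbalanced accident data.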
