Application of optimized machine learning techniques for prediction of occupational accidents

Abstract Although the usefulness of machine learning (ML) techniques in predicting future outcomes has been established in several application domains (e.g., health care), their use for predicting accidents in the occupational safety domain is still relatively new. This motivates an investigation of ML techniques for accident prediction. However, ML-based algorithms cannot deliver their best performance unless their parameters are properly tuned or optimized. Moreover, selecting an efficient optimized classifier alone may not serve the overall decision-making purpose, because such a classifier cannot explain the inter-relationships among the factors behind the occurrence of accidents. Hence, in addition to prediction, decision-making rules need to be extracted from the accident data. To address these issues, this research applies optimized machine learning algorithms to predict accident outcomes, namely injury, near miss, and property damage, using occupational accident data. Two popular machine learning algorithms, support vector machine (SVM) and artificial neural network (ANN), are used, with their parameters optimized by two powerful optimization algorithms, genetic algorithm (GA) and particle swarm optimization (PSO), in order to achieve a higher degree of accuracy and robustness. The PSO-based SVM outperforms the other algorithms, achieving the highest accuracy and robustness. Furthermore, rules are extracted by combining the C5.0 decision tree algorithm with the PSO-based SVM model. Finally, a set of nine useful rules is extracted to identify the root causes of the injury, near miss, and property damage cases. A case study from a steel plant is presented in support of the proposed methodology.
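As a rough illustration of the parameter-tuning idea described in the abstract, the following is a minimal sketch of particle swarm optimization in plain Python. It is not the authors' implementation. The function names, swarm settings, search bounds, and the objective `stand_in_error` are all assumptions made for illustration; in a setting like the one the abstract describes, the objective would instead be something like the cross-validated error of an SVM evaluated at candidate parameter values.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=60,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over box `bounds` with a basic global-best PSO.

    All default settings here are illustrative assumptions, not values
    taken from the paper.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Personal bests start at the initial positions.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    # Global best is the best of the personal bests.
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull toward personal best
                # + pull toward global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def stand_in_error(params):
    """Hypothetical stand-in for a classifier's validation error.

    Treats the two inputs as log10(C) and log10(gamma) of an SVM and
    returns a smooth bowl with its minimum at (1, -2). A real objective
    would train and validate a classifier instead.
    """
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


# Search log10(C) in [-3, 3] and log10(gamma) in [-5, 1] (assumed ranges).
best, best_err = pso_minimize(stand_in_error, bounds=[(-3.0, 3.0), (-5.0, 1.0)])
```

Searching in log-space is a common choice for SVM parameters such as C and gamma because useful values tend to span several orders of magnitude; whether the paper does the same is not stated in the abstract.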
