Maximising Accuracy and Efficiency of Traffic Accident Prediction Combining Information Mining with Computational Intelligence Approaches and Decision Trees

Abstract. Developing universal methodologies for the accurate, efficient, and timely prediction of traffic accident location and severity is a crucial endeavour. This study determines the best combinations of salient accident-related parameters and accurate accident severity prediction models for the 2005 accident dataset compiled by the Republic of Cyprus Police. The optimal methodology involves: (a) information mining, namely feature selection of the accident parameters that maximise prediction accuracy (implemented via scatter search), followed by feature extraction (implemented via principal component analysis) and selection of the minimal number of components that retain the salient information of the original parameters, which together reduce the dataset dimensionality by 74.42% overall; (b) accident severity prediction via probabilistic neural networks and random forests, each of which independently achieves over 96% correct prediction and a balanced proportion of under- and over-estimations of accident severity. The superiority of the optimal combinations of parameters and models is explained, and a comparison with existing accident classification/prediction approaches is given.
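
To make the prediction step in (b) more concrete, the following is a minimal sketch of how a probabilistic neural network assigns a class: each training example contributes a Gaussian kernel centred on itself, the kernel responses are averaged per class, and the class with the largest average response is chosen. This is an illustration only, not the authors' implementation. The function name `pnn_predict`, the smoothing parameter `sigma`, the equal treatment of class priors, and the toy two-dimensional data with the labels "slight" and "serious" are all assumptions introduced for the example; they are not taken from the Cyprus dataset.

```python
import math


def pnn_predict(train_x, train_y, query, sigma=0.5):
    """Classify `query` with a Gaussian-kernel probabilistic neural network.

    Assumptions (hypothetical, for illustration): equal class priors and a
    single shared smoothing parameter `sigma`.
    """
    kernel_sums = {}
    class_counts = {}
    for x, label in zip(train_x, train_y):
        # Pattern layer: Gaussian kernel centred on each training example.
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, query))
        response = math.exp(-sq_dist / (2.0 * sigma ** 2))
        # Summation layer: accumulate the responses per class.
        kernel_sums[label] = kernel_sums.get(label, 0.0) + response
        class_counts[label] = class_counts.get(label, 0) + 1
    # Average per class so that class size does not dominate the score.
    scores = {c: kernel_sums[c] / class_counts[c] for c in kernel_sums}
    # Decision layer: pick the class with the highest estimated density.
    return max(scores, key=scores.get)


# Toy data (hypothetical): two well-separated clusters in two dimensions,
# standing in for principal components of accident parameters.
train_x = [(0.0, 0.0), (0.2, -0.1), (-0.1, 0.3), (3.0, 3.0), (3.2, 2.9), (2.8, 3.1)]
train_y = ["slight", "slight", "slight", "serious", "serious", "serious"]

near_first_cluster = pnn_predict(train_x, train_y, (0.1, 0.0))
near_second_cluster = pnn_predict(train_x, train_y, (2.9, 3.1))
```

In this sketch the only tunable quantity is `sigma`: smaller values make the decision depend on the nearest training examples, while larger values smooth the class density estimates. In practice it would be chosen by validation on held-out data.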
