Relevant Feature Selection for Predicting the Severity of Motorcycle Accident in Thailand

Thailand ranks fifth in the world and first in Asia for road accident deaths, making road safety a major national problem. Road accidents affect both the quality of life of the population and the country's economy. Most road accidents in Thailand involve motorcycles, so there is a need to study the factors affecting these accidents. In this work, we applied feature selection and classification techniques to a dataset of motorcycle accidents to identify the important factors contributing to road accidents. In particular, the experiment compared the performance of K-Nearest Neighbor classification models trained on (i) the dataset with all features and (ii) the dataset with features selected by the Wrapper technique. No significant difference in performance was found, so the models built from the selected features were similar to those built from the full feature set. These selected features are the main contribution of this work, since they are potential factors that can cause road accidents. This finding yields insight that can be incorporated into future prevention plans for Thailand and for neighboring countries with similar environments.
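The comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the wrapper search strategy, the value of k, the evaluation protocol, or the dataset schema, so the sketch assumes greedy forward selection, k = 5, 5-fold cross-validation, and a synthetic two-class dataset in which only features 0 and 1 are informative. All function names (`knn_predict`, `cv_accuracy`, `forward_select`, `make_toy_data`) are hypothetical.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k, features):
    """Majority vote among the k nearest training rows, using only `features`."""
    dists = sorted(
        (sum((row[f] - x[f]) ** 2 for f in features), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def cv_accuracy(X, y, features, k=5, folds=5):
    """Cross-validated accuracy of KNN restricted to a feature subset."""
    n = len(X)
    correct = 0
    for fold in range(folds):
        test_idx = set(range(fold, n, folds))  # interleaved folds
        train_X = [X[i] for i in range(n) if i not in test_idx]
        train_y = [y[i] for i in range(n) if i not in test_idx]
        for i in test_idx:
            if knn_predict(train_X, train_y, X[i], k, features) == y[i]:
                correct += 1
    return correct / n


def forward_select(X, y, k=5, folds=5):
    """Wrapper selection: greedily add the feature that most improves
    the classifier's own cross-validated accuracy; stop when none helps."""
    remaining = list(range(len(X[0])))
    selected, best_score = [], 0.0
    while remaining:
        scored = [(cv_accuracy(X, y, selected + [f], k, folds), f) for f in remaining]
        score, feature = max(scored)
        if score <= best_score:
            break
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score


def make_toy_data(n=120, n_features=6, seed=0):
    """Synthetic stand-in for the accident data (an assumption, not real data)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        row[0] += 3 * label  # informative feature
        row[1] -= 3 * label  # informative feature
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_data()
all_features = list(range(len(X[0])))
acc_all = cv_accuracy(X, y, all_features)          # (i) all features
selected, acc_selected = forward_select(X, y)      # (ii) wrapper-selected features
print("all features:", round(acc_all, 3))
print("selected", selected, ":", round(acc_selected, 3))
```

The wrapper approach scores each candidate subset with the same classifier that will be used afterwards, which is why `forward_select` calls `cv_accuracy` directly. The sketch only prints the two accuracies; judging whether their difference is significant, as the study does, would require a statistical test that is not shown here.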
