A Data Transformation Technique for Car Injury Prediction

Prediction and classification are essential functions of a knowledge discovery system. To improve traffic safety, a neural network prediction model is proposed as a knowledge discovery enhancement for the Critical Analysis Reporting Environment (CARE). CARE was designed to provide information that assists in reducing the fatalities, injuries, and economic losses caused by car crashes. All data in CARE are encoded in categorical format, so a neural network cannot process them by simply being fed the raw categorical codes. A suitable data transformation has to be applied before a neural network can reach its full accuracy potential. In this study, a frequency-based data transformation scheme is used to convert categorical codes into numerical values for a car injury prediction model. During the transformation, additional prior knowledge from the traffic safety domain is also incorporated into the coding procedure. We also propose a modified frequency-based scheme, which yields better prediction results. Records of 1997 alcohol-related crashes on Alabama interstates were used to evaluate the proposed approach. The computational results show that the proposed modified frequency-based technique outperforms the traditional 1-to-N binary code method in both accuracy and efficiency.
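
To make the contrast between the two encodings concrete, the following is a minimal sketch, not the coding procedure used in the study. It assumes that a frequency-based scheme maps each categorical code to its relative frequency in the data set, which is one common reading of the term; the modified scheme and the domain knowledge mentioned above are not reproduced here. The `weather` attribute, its values, and the function names are hypothetical and chosen only for illustration.

```python
from collections import Counter


def frequency_encode(values):
    """Map each categorical code to its relative frequency (assumed scheme)."""
    counts = Counter(values)
    total = len(values)
    mapping = {code: counts[code] / total for code in counts}
    # One numeric input per attribute, regardless of how many codes it has.
    return [mapping[v] for v in values], mapping


def one_to_n_encode(values):
    """Traditional 1-to-N binary coding: one input per distinct code."""
    categories = sorted(set(values))
    index = {code: i for i, code in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        rows.append(row)
    return rows, categories


# Hypothetical attribute values, not taken from the CARE data.
weather = ["clear", "rain", "clear", "fog", "clear", "rain"]

freq_column, freq_map = frequency_encode(weather)
binary_rows, categories = one_to_n_encode(weather)

print(freq_map)        # one number per code
print(binary_rows[0])  # N binary inputs per record
```

Under this assumption, the frequency-based encoding feeds the network one input per attribute, whereas 1-to-N coding needs as many inputs as the attribute has distinct codes. A smaller input layer is one plausible source of the efficiency gain the abstract reports.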