Applying a Hybrid Data Mining Approach to Develop Carotid Artery Prediction Models

This paper proposes a hybrid method for imbalanced medical data sets with many features. The synthetic minority over-sampling technique (SMOTE) is used to address the two-class imbalance problem. SMOTE generates synthetic instances of the positive class to balance the training data set, which increases the weight of the small, specific region of the decision space that belongs to the positive class. A genetic algorithm (GA) is then used for feature selection. Its purpose is to reduce redundant information among the selected features, and it emphasizes selecting a small subset of salient features by means of a subset size determination scheme. Finally, the selected features are passed to a back-propagation neural network (NN) and a decision tree to predict carotid artery disease. Experimental results show that these methods achieve high accuracy, so they can help doctors inform patients about the possibility of disease.
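
To illustrate the over-sampling step described above, the following is a minimal sketch of SMOTE-style interpolation in plain Python. The function name `smote`, its parameters (`n_synthetic`, `k`, `seed`), and the toy data are assumptions made for illustration only. This is not the implementation used in the experiments, and it omits the GA feature selection and the classifiers.

```python
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority-class
    samples and their k nearest minority-class neighbours.

    Hypothetical helper for illustration; not the paper's code.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than samples

    def sq_dist(a, b):
        # Squared Euclidean distance between two feature vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_synthetic):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of its k nearest minority neighbours (excluding itself).
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: sq_dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # Place the new point at a random position on the segment
        # between the base point and the chosen neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy two-feature data: 10 negative (majority) and 3 positive (minority) cases.
majority = [[float(i), float(i % 3)] for i in range(10)]
minority = [[1.0, 5.0], [2.0, 6.0], [1.5, 5.5]]

# Generate enough synthetic positives to match the majority class size.
new_points = smote(minority, len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
```

Because each synthetic point lies on a line segment between two existing positive cases, the positive class occupies a denser region of the feature space after balancing. In practice, over-sampling should be applied to the training split only, so that synthetic cases do not leak into the test data.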