A new weighted support vector machine with GA-based parameter selection

When the training set has uneven class sizes, the classification error of the C-support vector machine (C-SVM) is undesirably biased towards the class with the smaller training set. When the training set contains many duplicated samples, C-SVM treats every copy as a separate sample, which lengthens training. Based on an analysis of the causes of these problems, a new weighted support vector machine algorithm is proposed that compensates for the unfavorable impact of uneven class sizes and increases the decision speed. To obtain good generalization performance, a genetic algorithm (GA) is used to tune the regularization parameter and the kernel function parameter when training the model. Experiments show that the proposed approach can control the misclassification error rate of each class and handle multi-duplicated samples while retaining good generalization performance.
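
The two ideas in the abstract, per-sample weighting and GA-based parameter selection, can be illustrated with a minimal sketch. The abstract does not give the weighting formula or the GA operators, so everything below is an assumption made for illustration: the function names `weighted_unique_samples` and `ga_tune` are hypothetical, the weight is taken to be the duplicate count times a class weight inversely proportional to class size, and the GA is a simple real-coded variant with elitism, midpoint crossover, and Gaussian mutation over (log10 C, log10 gamma). In practice the fitness would likely be a cross-validated score of the weighted SVM; here it is left as a caller-supplied function.

```python
import random
from collections import Counter


def weighted_unique_samples(X, y):
    """Collapse duplicated samples and attach one weight per unique sample.

    Assumed weighting (not taken from the paper):
        weight = duplicate_count * n_total / (n_classes * class_size)
    so each class ends up with the same total weight.
    """
    counts = Counter((tuple(x), label) for x, label in zip(X, y))
    class_sizes = Counter(y)
    n_total, n_classes = len(y), len(class_sizes)

    samples, labels, weights = [], [], []
    for (x, label), duplicates in counts.items():
        class_weight = n_total / (n_classes * class_sizes[label])
        samples.append(list(x))
        labels.append(label)
        weights.append(duplicates * class_weight)
    return samples, labels, weights


def ga_tune(fitness, bounds, pop_size=20, generations=30,
            mutation_scale=0.3, seed=0):
    """Tiny real-coded GA; returns the best individual found.

    `fitness` maps a candidate [log10_C, log10_gamma] to a score to maximize.
    `bounds` is a list of (low, high) pairs, one per gene.
    """
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]

    for _ in range(generations):
        # Elitism: the better half survives unchanged.
        elite = sorted(pop, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Midpoint crossover followed by Gaussian mutation.
            child = [(u + v) / 2 + rng.gauss(0, mutation_scale)
                     for u, v in zip(a, b)]
            # Clip each gene back into its search interval.
            children.append([max(lo, min(hi, g))
                             for g, (lo, hi) in zip(child, bounds)])
        pop = elite + children

    return max(pop, key=fitness)
```

The weights returned by `weighted_unique_samples` could be passed to any SVM trainer that accepts per-sample weights, each one scaling the penalty applied to its sample, and the pair returned by `ga_tune` would be converted back with `C = 10 ** best[0]` and `gamma = 10 ** best[1]`. Searching in log space is a common choice here because useful values of both parameters typically span several orders of magnitude.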