A Fast SVM Classification Learning Algorithm for Large Training Sets

A support vector machine (SVM) can be very difficult to train on a large-scale training set, and may fail to work at all, because computation time and memory use grow tremendously with the number of samples. A new fast learning algorithm for large-scale SVM training is proposed for the case of sample aliasing, where samples of different classes are mixed together. The aliased sample points that belong to different classes are eliminated first, and then the relative boundary vectors (RBVs) are computed. The algorithm selects for SVM training not only each RBV itself but also any nearby sample whose distance to an RBV is smaller than a certain value, so that critical sample points for the optimal hyperplane are not lost. The training samples retained after pruning are the essential ones, and their number is only about 1/4 to 1/3, and in some cases only 1/10 to 1/7, of the number of original training samples. Training the SVM on these final samples shortens the training time remarkably. Most importantly, the classification accuracy stays almost the same as that obtained when the large-scale sample set is used directly for training. Simulation results show that this fast learning algorithm is very effective and can serve as a good practical approach for large-scale SVM training.
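The three pruning steps described above (remove aliased points, find boundary vectors, keep samples near them) can be illustrated with a minimal Python sketch. The abstract does not give the exact definitions, so everything concrete here is an assumption rather than the paper's method: a point is treated as aliased when most of its `k` nearest neighbours carry a different label, the RBV is approximated by each sample's nearest sample of another class, and the function names and the parameters `k` and `radius` are hypothetical.

```python
import math


def knn_indices(points, i, k):
    """Indices of the k nearest neighbours of points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return others[:k]


def remove_aliased(points, labels, k=3):
    """Assumed aliasing test: drop a sample when more than half of its
    k nearest neighbours belong to a different class."""
    keep = []
    for i in range(len(points)):
        nbrs = knn_indices(points, i, k)
        foreign = sum(1 for j in nbrs if labels[j] != labels[i])
        if foreign * 2 <= len(nbrs):
            keep.append(i)
    return keep


def boundary_vectors(points, labels, candidates):
    """Assumed stand-in for RBVs: for every remaining sample, mark its
    nearest remaining sample of another class as a boundary vector."""
    rbv = set()
    for i in candidates:
        opposite = [j for j in candidates if labels[j] != labels[i]]
        if opposite:
            rbv.add(min(opposite, key=lambda j: math.dist(points[i], points[j])))
    return rbv


def select_training_subset(points, labels, k=3, radius=0.5):
    """Return sorted indices of the samples kept for SVM training."""
    candidates = remove_aliased(points, labels, k)
    rbv = boundary_vectors(points, labels, candidates)
    selected = set(rbv)
    # Also keep every remaining sample closer than `radius` to some boundary vector.
    for i in candidates:
        if any(math.dist(points[i], points[r]) < radius for r in rbv):
            selected.add(i)
    return sorted(selected)


# Toy data: two well-separated classes on a line, plus one class-1 point
# (index 10) placed in the middle of class 0 to play the aliased sample.
points = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0),
          (6, 0), (7, 0), (8, 0), (9, 0), (10, 0),
          (1.5, 0.2)]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]

selected = select_training_subset(points, labels, k=3, radius=1.5)
print(selected)  # [3, 4, 5, 6]
```

On this toy set the aliased point is discarded and only the four samples nearest the gap between the classes are kept; these would then be passed to an ordinary SVM trainer. The brute-force neighbour search is quadratic in the number of samples and is used only to keep the sketch short.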