Distance-Based Selection of Potential Support Vectors by Kernel Matrix
We follow the idea of decomposing a large data set into smaller groups, and present a novel distance-based method that selects potential support vectors in each group by means of the kernel matrix. Potential support vectors selected in one group are passed to the next group for further selection. Quadratic programming is performed only once, on the potential support vectors retained after the last group, to construct the optimal hyperplane. This avoids solving unnecessary quadratic programming problems at intermediate stages, and the number of selected potential support vectors can be controlled to cope with limited memory capacity and the capability of existing optimizers. Since this distance-based method does not work well on data containing outliers and noise, we introduce the idea of separating the outliers/noise from the base data by use of the k-nearest neighbor algorithm, which improves generalization ability. Two optimal hyperplanes are constructed, one on the base part and one on the outlier/noise part, and are then synthesized to derive the optimal hyperplane on the overall data.
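The staged, distance-based selection can be sketched as follows. This is a minimal illustration, not the paper's exact procedure: the function names (`rbf_kernel`, `select_candidates`, `staged_selection`), the choice of an RBF kernel, and the specific criterion of keeping the points whose feature-space distance to the nearest opposite-class point is smallest (i.e., points near the class boundary, which are the likely support vectors) are all assumptions made for illustration. The key device from the abstract is the kernel trick for distances: the squared feature-space distance is computable from kernel entries alone, d²(xᵢ, xⱼ) = Kᵢᵢ − 2Kᵢⱼ + Kⱼⱼ.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # K[i, j] = exp(-gamma * ||A_i - B_j||^2); RBF is an illustrative choice
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-gamma * sq)

def select_candidates(X, y, n_keep, gamma=1.0):
    """Keep the n_keep points whose feature-space distance to the
    nearest opposite-class point is smallest (a hypothetical
    boundary-proximity criterion)."""
    K = rbf_kernel(X, X, gamma)
    diag = np.diag(K)
    # kernel trick for distances: d^2(x_i, x_j) = K_ii - 2 K_ij + K_jj
    D2 = diag[:, None] - 2.0 * K + diag[None, :]
    opposite = y[:, None] != y[None, :]
    dist_to_other_class = np.where(opposite, D2, np.inf).min(axis=1)
    keep = np.argsort(dist_to_other_class)[:n_keep]
    return X[keep], y[keep]

def staged_selection(groups, n_keep, gamma=1.0):
    """Process groups one at a time, passing the survivors of each
    group into the next; only the final survivors would then be
    handed to a single quadratic-programming solve."""
    X_cand = np.empty((0, groups[0][0].shape[1]))
    y_cand = np.empty((0,), dtype=int)
    for X_g, y_g in groups:
        X_all = np.vstack([X_cand, X_g])
        y_all = np.concatenate([y_cand, y_g])
        X_cand, y_cand = select_candidates(X_all, y_all, n_keep, gamma)
    return X_cand, y_cand
```

Only the points returned by `staged_selection` would be passed to the QP solver, so memory use and solver load are bounded by `n_keep` rather than by the full data-set size.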