A widely used tool for binary classification is the Support Vector Machine (SVM), a supervised learning technique that finds the "maximum margin" linear separator between the two classes. While SVMs have been well studied in the batch (offline) setting, there is considerably less work on the streaming (online) setting, which requires only a single pass over the data using sub-linear space. Existing streaming algorithms are not yet competitive with the batch implementation. In this paper, we use the formulation of the SVM as a minimum enclosing ball (MEB) problem to provide a streaming SVM algorithm based on the blurred ball cover originally proposed by Agarwal and Sharathkumar. Our implementation consistently outperforms existing streaming SVM approaches and achieves higher accuracy than libSVM on several datasets, making it competitive with the standard batch SVM implementation.
[1] Vladimir Vapnik et al. Statistical learning theory. 1998.
[2] Jason Weston et al. Fast Kernel Classifiers with Online and Active Learning. J. Mach. Learn. Res., 2005.
[3] F. Rosenblatt et al. The perceptron: a probabilistic model for information storage and organization in the brain. Psychological Review, 1958.
[4] Kenneth L. Clarkson et al. Optimal core-sets for balls. Comput. Geom., 2008.
[5] Timothy M. Chan et al. Streaming and Dynamic Algorithms for Minimum Enclosing Balls in High Dimensions. WADS, 2011.
[6] Suresh Venkatasubramanian et al. Streamed Learning: One-Pass SVMs. IJCAI, 2009.
[7] Pankaj K. Agarwal et al. Streaming Algorithms for Extent Problems in High Dimensions. SODA '10, 2010.
[8] Ivor W. Tsang et al. Core Vector Machines: Fast SVM Training on Very Large Data Sets. J. Mach. Learn. Res., 2005.