A kernel-based Perceptron with dynamic memory

In this study, we propose a dynamic memory strategy to efficiently control the size of the support set in a kernel-based Perceptron learning algorithm. The method consists of two operations: an incremental projection and a decremental projection. In the incremental projection, a newly presented instance is either added to the support set or discarded according to a predefined rule. To reduce information loss, discarded examples are not simply thrown away; instead, their contribution to the discriminative function is retained by a projection technique that maps the updated discriminative function into the space spanned by the current support set. When a new example is added to the support set, the algorithm proceeds to the decremental projection: it evaluates the minimum information loss incurred by deleting one instance from the support set. If this minimum loss falls below a tolerable threshold, the corresponding instance is removed, and its contribution to the discriminative function is again preserved by the projection technique. In this way, our method keeps the support set relatively small while achieving high classification accuracy. We also develop a variant that imposes a fixed budget on the size of the support set. We test our approaches on four benchmark data sets and find that they outperform competing methods, attaining either higher classification accuracy when the support-set sizes are comparable, or smaller support sets when the classification accuracies are similar.
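The incremental projection described above can be sketched as follows. This is a minimal illustration, not the authors' exact formulation: the RBF kernel, the parameter names (`gamma`, `eta`), the ridge term used to stabilize the Gram-matrix solve, and the omission of the decremental step are all assumptions made for brevity.

```python
import numpy as np

class ProjectedKernelPerceptron:
    """Kernel Perceptron with a projection-based incremental update (sketch).

    On a mistake, if the new point's feature vector is well approximated by
    the span of the current support set (projection residual <= eta), the
    point is discarded but its contribution is folded into the existing
    coefficients; otherwise it is added as a new support vector.
    The decremental projection of the paper is omitted for brevity.
    """

    def __init__(self, gamma=1.0, eta=0.1):
        self.gamma = gamma  # RBF kernel width (illustrative choice)
        self.eta = eta      # tolerable projection-error threshold
        self.X = []         # support set
        self.alpha = []     # expansion coefficients

    def _k(self, a, b):
        diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
        return np.exp(-self.gamma * np.dot(diff, diff))

    def decision(self, x):
        return sum(a * self._k(s, x) for a, s in zip(self.alpha, self.X))

    def partial_fit(self, x, y):
        if self.X and y * self.decision(x) > 0:
            return  # correctly classified: no update
        if not self.X:
            self.X.append(x)
            self.alpha.append(float(y))
            return
        # Project y * k(x, .) onto the span of the current support set:
        # solve K d = k_x, where K is the Gram matrix of the support set.
        K = np.array([[self._k(a, b) for b in self.X] for a in self.X])
        kx = np.array([self._k(s, x) for s in self.X])
        d = np.linalg.solve(K + 1e-8 * np.eye(len(self.X)), kx)
        err = self._k(x, x) - kx @ d  # squared projection residual
        if err <= self.eta:
            # Discard x, but keep its effect on the discriminative function.
            self.alpha = [a + y * di for a, di in zip(self.alpha, d)]
        else:
            self.X.append(x)
            self.alpha.append(float(y))
```

With this rule, points that lie close to the span of the existing support vectors never enlarge the support set, so well-clustered data is typically represented by only a few support vectors.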
