Perceptrons in Kernel Feature Spaces

The weight vector of a perceptron can be represented in two ways: in an explicit form, where the vector is directly available, or in a data-dependent form, where it is represented as a weighted sum of training patterns. Replacing the scalar products in a data-dependent perceptron by kernel functions yields a nonlinear version of the algorithm. For Muroga's and Minnick's linear programming perceptron, a data-dependent version with kernels and regularisation is presented: the linear programming machine, which performs about as well as support vector machines while solving only LINEAR programs (support vector learning is based on solving QUADRATIC programs). In the decision function of a kernel-based perceptron, the expansion vectors may be nonlinearly dependent on each other, that is, linearly dependent in kernel feature space. These dependencies can be eliminated to compress the decision function without loss, by removing redundant expansion vectors and updating the multipliers of the remaining ones. The resulting compression ratio can be viewed as a complexity measure similar to, but tighter than, Vapnik's leave-one-out bound.
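To make the data-dependent representation concrete, the following is a minimal Python sketch of a kernelised perceptron, not the exact algorithm of the text: the weight vector is never formed explicitly; every training pattern instead carries a multiplier, and the decision function is a sum of kernel evaluations. The Gaussian RBF kernel, the epoch count, and the function names are illustrative assumptions.

import numpy as np

def rbf_kernel(x1, x2, gamma=0.5):
    """Gaussian RBF kernel replacing the explicit scalar product <x1, x2> (assumed choice)."""
    return np.exp(-gamma * np.sum((x1 - x2) ** 2))

def kernel_perceptron(X, y, kernel=rbf_kernel, epochs=10):
    """Data-dependent perceptron: instead of an explicit weight vector,
    each training pattern x_j carries a multiplier alpha_j, and the
    decision function is f(x) = sign(sum_j alpha_j y_j k(x_j, x)).
    y is expected to contain labels in {-1, +1}."""
    y = np.asarray(y, float)
    m = len(y)
    alpha = np.zeros(m)
    K = np.array([[kernel(xi, xj) for xj in X] for xi in X])
    for _ in range(epochs):
        for i in range(m):
            # the prediction uses only kernel values, never an explicit w
            if y[i] * np.sum(alpha * y * K[:, i]) <= 0:
                alpha[i] += 1.0  # standard kernelised perceptron update
    return alpha

def decision(alpha, X, y, x_new, kernel=rbf_kernel):
    """Evaluate the kernel expansion on a new pattern."""
    return np.sign(sum(a * yi * kernel(xi, x_new)
                       for a, yi, xi in zip(alpha, y, X)))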
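The linear programming machine mentioned above can be sketched as follows, assuming the common L1-regularised formulation that minimises the sum of the multipliers plus a slack penalty and hands the resulting LINEAR program to scipy.optimize.linprog. The regularisation constant C, the RBF kernel, and the exact constraint form are assumptions for illustration, not details taken from the text.

import numpy as np
from scipy.optimize import linprog

def rbf_gram(X1, X2, gamma=0.5):
    """Gaussian RBF kernel matrix between the rows of X1 and X2 (assumed kernel)."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def train_lp_machine(X, y, C=1.0, gamma=0.5):
    """Assumed LP-machine formulation:
        minimise   sum_j alpha_j + C * sum_i xi_i
        subject to y_i (sum_j alpha_j y_j k(x_i, x_j) + b) >= 1 - xi_i,
                   alpha_j >= 0, xi_i >= 0, b free."""
    y = np.asarray(y, float)
    m = len(y)
    K = rbf_gram(X, X, gamma)
    # variables: [alpha (m), b_plus, b_minus, xi (m)]; b = b_plus - b_minus
    c = np.concatenate([np.ones(m), [0.0, 0.0], C * np.ones(m)])
    Yk = y[:, None] * K * y[None, :]                  # y_i y_j k(x_i, x_j)
    A_ub = np.hstack([-Yk, -y[:, None], y[:, None], -np.eye(m)])
    b_ub = -np.ones(m)                                # margin constraints as A_ub z <= -1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub)            # default bounds keep all variables >= 0
    alpha, b = res.x[:m], res.x[m] - res.x[m + 1]
    return alpha, b

def predict(alpha, b, X_train, y_train, X_new, gamma=0.5):
    """Kernel expansion learned by the LP machine, evaluated on new data."""
    K = rbf_gram(np.atleast_2d(X_new), X_train, gamma)
    return np.sign(K @ (alpha * np.asarray(y_train, float)) + b)

The point of the construction is that the optimisation remains a linear program of size proportional to the number of training patterns, in contrast to the quadratic program solved in support vector learning.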
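The lossless compression of the decision function can be illustrated by the sketch below: whenever an expansion vector is (numerically) linearly dependent on the others in feature space, it can be dropped and its coefficient redistributed over the remaining ones. The sketch works directly on the kernel (Gram) matrix; the least-squares dependence test and the tolerance tol are implementation assumptions, not the procedure of the text.

import numpy as np

def compress_expansion(K, coef, tol=1e-8):
    """Losslessly shrink a kernel expansion f(x) = sum_j coef_j k(x_j, x).
    If phi(x_r) = sum_j c_j phi(x_j) for the remaining expansion vectors,
    x_r can be removed and coef_j updated to coef_j + coef_r * c_j without
    changing f.  K is the Gram matrix of the expansion vectors; coef holds
    the (signed) multipliers."""
    keep = list(range(len(coef)))
    coef = np.asarray(coef, float).copy()
    r = 0
    while r < len(keep):
        rest = keep[:r] + keep[r + 1:]        # candidate basis without keep[r]
        if not rest:
            break
        Kbb = K[np.ix_(rest, rest)]
        Kbr = K[rest, keep[r]]
        # least-squares coefficients expressing phi(x_r) in the remaining vectors
        c, *_ = np.linalg.lstsq(Kbb, Kbr, rcond=None)
        # squared residual in feature space: k(x_r, x_r) - c^T K_br
        residual = K[keep[r], keep[r]] - c @ Kbr
        if residual < tol:                    # x_r is (numerically) dependent
            coef[rest] += coef[keep[r]] * c   # redistribute its multiplier
            keep.pop(r)                       # drop the redundant vector
        else:
            r += 1
    return keep, coef

The ratio of retained to original expansion vectors, len(keep) / len(coef), corresponds to the compression ratio used as a complexity measure in the text.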