Reducing Kernel Matrix Diagonal Dominance Using Semi-definite Programming

Kernel-based learning methods revolve around the notion of a kernel or Gram matrix between data points. These square, symmetric, positive semi-definite matrices can informally be regarded as encoding the pairwise similarity between all of the objects in a data set. In this paper we propose an algorithm for manipulating the diagonal entries of a kernel matrix using semi-definite programming. Kernel matrix diagonal dominance reduction addresses the problem of learning with almost orthogonal features, a phenomenon commonplace in kernel matrices derived from string kernels or from Gaussian kernels with a small width parameter. We show how this task can be formulated as a semi-definite programming optimization problem that can be solved with readily available optimizers. On the theoretical side, we give an analysis based on Rademacher complexity bounds that provides an alternative motivation for the 1-norm SVM from the perspective of kernel diagonal reduction. We assess the performance of the algorithm on standard data sets, with encouraging results in terms of both approximation and prediction.
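The sketch below illustrates one way such a semi-definite program might be set up with an off-the-shelf optimizer (here CVXPY, which is an assumption; the paper does not prescribe a particular solver). The specific formulation shown, maximizing the total amount subtracted from the diagonal subject to the corrected matrix remaining positive semi-definite, is a plausible instance of diagonal dominance reduction rather than necessarily the exact program proposed in the paper.

```python
import numpy as np
import cvxpy as cp

# Toy data: a Gaussian kernel with a small width parameter, which tends to
# produce a strongly diagonally dominant kernel matrix (near-orthogonal features).
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 5))
sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
sigma = 0.1  # small width parameter
K = np.exp(-sq_dists / (2 * sigma ** 2))
K = 0.5 * (K + K.T)  # enforce exact symmetry for the SDP constraint

n = K.shape[0]

# Hypothetical SDP (an illustrative formulation, not taken from the paper):
# subtract as much mass as possible from the diagonal while K - diag(d)
# stays positive semi-definite, so the result is still a valid kernel matrix.
d = cp.Variable(n, nonneg=True)
objective = cp.Maximize(cp.sum(d))
constraints = [K - cp.diag(d) >> 0]
cp.Problem(objective, constraints).solve()

K_reduced = K - np.diag(d.value)
print("mean diagonal before:", K.diagonal().mean())
print("mean diagonal after: ", K_reduced.diagonal().mean())
print("smallest eigenvalue after:", np.linalg.eigvalsh(K_reduced).min())
```

In a practical variant one might also bound each diagonal correction individually, trading off how much dominance is removed against how closely the reduced matrix preserves the original off-diagonal structure.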