Non-sparse Multiple Kernel Learning

Existing approaches to multiple kernel learning (MKL) employ l1-norm constraints on the mixing coefficients to promote sparse kernel combinations. When features encode orthogonal characterizations of a problem, sparseness may lead to discarding useful information and may thus result in poor generalization performance. We study non-sparse multiple kernel learning by imposing an l2-norm constraint on the mixing coefficients. Empirically, l2-MKL proves robust against noisy and redundant feature sets and significantly improves the promoter detection rate compared to l1-norm and canonical MKL at large scale.
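
For reference, a minimal sketch of the standard MKL primal with an lp-norm constraint on the mixing coefficients (the symbols theta_m, K_m, H_m, and the loss l are generic MKL notation assumed here, not taken from the abstract):

\[
\min_{\substack{\theta \ge 0,\ \|\theta\|_p \le 1}} \;\; \min_{f_m \in \mathcal{H}_m,\ b} \;\;
\frac{1}{2} \sum_{m=1}^{M} \frac{\|f_m\|_{\mathcal{H}_m}^2}{\theta_m}
\; + \; C \sum_{i=1}^{n} \ell\!\Big( \sum_{m=1}^{M} f_m(x_i) + b,\; y_i \Big),
\]

where the effective kernel is \(K = \sum_{m=1}^{M} \theta_m K_m\). Setting \(p = 1\) recovers the sparse l1-norm variant, while \(p = 2\) yields the non-sparse l2-MKL studied here, which tends to retain small but non-zero weights on all kernels instead of switching most of them off.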