Sparse Bayesian Modeling With Adaptive Kernel Learning

Sparse kernel methods are efficient for solving regression and classification problems. The sparsity and performance of these methods depend on selecting an appropriate kernel function, which is typically done with a cross-validation procedure. In this paper, we propose an incremental method for supervised learning that is similar to the relevance vector machine (RVM) but also learns the parameters of the kernels during model training. Specifically, we learn different parameter values for each kernel, resulting in a very flexible model. To avoid overfitting, we use a sparsity-enforcing prior that controls the effective number of parameters of the model. We present experimental results on artificial data to demonstrate the advantages of the proposed method, and we compare it with the standard RVM on several commonly used regression and classification data sets.
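To make the idea of "different parameter values for each kernel" concrete, the following is a minimal sketch, not the method proposed in the paper. It assumes one-dimensional inputs and Gaussian kernels in which each basis function has its own width, and it computes the usual Gaussian posterior mean of the weights for fixed hyperparameters. The function names, the kernel form, and the values of `centers`, `widths`, `alpha`, and `beta` are illustrative assumptions; the incremental training and the sparsity-enforcing prior described above are not implemented.

```python
import numpy as np

def design_matrix(x, centers, widths):
    """Phi[n, i] = exp(-(x[n] - centers[i])**2 / (2 * widths[i]**2)).

    Each column i has its own width, unlike a standard RVM where a
    single width is shared by all kernels (assumed Gaussian form).
    """
    diff = x[:, None] - centers[None, :]
    return np.exp(-0.5 * (diff / widths[None, :]) ** 2)

def posterior_mean(Phi, t, alpha, beta):
    """Posterior mean of the weights for a linear-Gaussian model with
    prior w_i ~ N(0, 1 / alpha[i]) and noise precision beta."""
    precision = beta * Phi.T @ Phi + np.diag(alpha)
    return beta * np.linalg.solve(precision, Phi.T @ t)

# Toy data: noisy sine curve (illustrative only).
rng = np.random.default_rng(0)
x = np.linspace(-5.0, 5.0, 50)
t = np.sin(x) + 0.1 * rng.standard_normal(x.shape)

# Hypothetical kernel locations and per-kernel widths.
centers = np.array([-3.0, 0.0, 3.0])
widths = np.array([0.5, 1.0, 2.0])

Phi = design_matrix(x, centers, widths)
mu = posterior_mean(Phi, t, alpha=np.ones(3), beta=100.0)
```

In a method of the kind described in the abstract, `centers` and `widths` would be learned during training rather than fixed by hand, and kernels whose weights are pruned by the prior would be dropped from `Phi`.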
