Asymmetric kernel in Gaussian Processes for learning target variance

This work incorporates the multi-modality of the data distribution into a Gaussian Process regression model. We approach the problem from a discriminative perspective by learning, jointly over the training data, the target space variance in the neighborhood of a certain sample through metric learning. We start by using data centers rather than all training samples. Subsequently, each center selects an individualized kernel metric. This enables each center to adjust the kernel space in its vicinity in correspondence with the topology of the targets --- a multi-modal approach. We additionally add descriptiveness by allowing each center to learn a precision matrix. We demonstrate empirically the reliability of the model.

[1]  Nuno Vasconcelos,et al.  Counting People With Low-Level Features and Bayesian Regression , 2012, IEEE Transactions on Image Processing.

[2]  Andreas Buja,et al.  Stress functions for nonlinear dimension reduction, proximity analysis, and graph drawing , 2013, J. Mach. Learn. Res..

[3]  Luc Van Gool,et al.  The 2005 PASCAL Visual Object Classes Challenge , 2005, MLCW.

[4]  Tsuyoshi Murata,et al.  {m , 1934, ACML.

[5]  Carl E. Rasmussen,et al.  Infinite Mixtures of Gaussian Process Experts , 2001, NIPS.

[6]  Zoubin Ghahramani,et al.  Sparse Gaussian Processes using Pseudo-inputs , 2005, NIPS.

[7]  Carl E. Rasmussen,et al.  A Unifying View of Sparse Approximate Gaussian Process Regression , 2005, J. Mach. Learn. Res..

[8]  Chao Yuan,et al.  Variational Mixture of Gaussian Process Experts , 2008, NIPS.

[9]  Volker Tresp,et al.  Mixtures of Gaussian Processes , 2000, NIPS.

[10]  Wenye Li Gaussian Process Learning: A Divide-and-Conquer Approach , 2014, ISNN.

[11]  Lehel Csató,et al.  Sparse On-Line Gaussian Processes , 2002, Neural Computation.

[12]  David J. Fleet,et al.  Efficient Optimization for Sparse Gaussian Process Regression , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13]  Stephen Tyree,et al.  Non-linear Metric Learning , 2012, NIPS.

[14]  Mark J. Schervish,et al.  Nonstationary Covariance Functions for Gaussian Process Regression , 2003, NIPS.

[15]  Malte Kuss,et al.  Approximate inference for robust Gaussian process regression , 2005 .

[16]  Kilian Q. Weinberger,et al.  Distance Metric Learning for Large Margin Nearest Neighbor Classification , 2005, NIPS.

[17]  Ramprasaath R. Selvaraju,et al.  Counting Everyday Objects in Everyday Scenes , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  Simon Osindero,et al.  An Alternative Infinite Mixture Of Gaussian Process Experts , 2005, NIPS.

[19]  Mark Mackenzie,et al.  Asymmetric kernel regression , 2004, IEEE Transactions on Neural Networks.

[20]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Michalis K. Titsias,et al.  Variational Learning of Inducing Variables in Sparse Gaussian Processes , 2009, AISTATS.

[22]  Amir Globerson,et al.  Metric Learning by Collapsing Classes , 2005, NIPS.

[23]  Christopher K. I. Williams,et al.  Discovering Hidden Features with Gaussian Processes Regression , 1998, NIPS.

[24]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[25]  Benjamin Recht,et al.  Random Features for Large-Scale Kernel Machines , 2007, NIPS.

[26]  David A. McAllester,et al.  A discriminatively trained, multiscale, deformable part model , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[27]  Miguel Lázaro-Gredilla,et al.  Variational Heteroscedastic Gaussian Process Regression , 2011, ICML.

[28]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[29]  Trevor Darrell,et al.  What you saw is not what you get: Domain adaptation using asymmetric kernel transforms , 2011, CVPR 2011.

[30]  Michael I. Jordan,et al.  Distance Metric Learning with Application to Clustering with Side-Information , 2002, NIPS.

[31]  Inderjit S. Dhillon,et al.  Metric and Kernel Learning Using a Linear Transformation , 2009, J. Mach. Learn. Res..

[32]  Ming-Hsuan Yang,et al.  Online Sparse Gaussian Process Regression and Its Applications , 2011, IEEE Transactions on Image Processing.

[33]  Neil D. Lawrence,et al.  Bayesian Gaussian Process Latent Variable Model , 2010, AISTATS.

[34]  Hang Li,et al.  Asymmetric Kernel Learning , 2010 .

[35]  K. Tsuda Support Vector Classi er with Asymmetric Kernel Functions , 1998 .

[36]  Kilian Q. Weinberger,et al.  Metric Learning for Kernel Regression , 2007, AISTATS.

[37]  Neil D. Lawrence,et al.  Fast Forward Selection to Speed Up Sparse Gaussian Process Regression , 2003, AISTATS.

[38]  Neil D. Lawrence,et al.  Overlapping Mixtures of Gaussian Processes for the Data Association Problem , 2011, Pattern Recognit..

[39]  Kian Hsiang Low,et al.  A Unifying Framework of Anytime Sparse Gaussian Process Regression Models with Stochastic Variational Inference for Big Data , 2015, ICML.

[40]  Edwin V. Bonilla,et al.  Fast Allocation of Gaussian Process Experts , 2014, ICML.

[41]  Marc Sebban,et al.  Metric Learning , 2015, Metric Learning.

[42]  Joachim Denzler,et al.  Large-Scale Gaussian Process Classification with Flexible Adaptive Histogram Kernels , 2012, ECCV.

[43]  Wolfram Burgard,et al.  Most likely heteroscedastic Gaussian process regression , 2007, ICML '07.

[44]  Carl E. Rasmussen,et al.  Gaussian processes for machine learning , 2005, Adaptive computation and machine learning.

[45]  Neil D. Lawrence,et al.  Gaussian Processes for Big Data , 2013, UAI.

[46]  Arnold W. M. Smeulders,et al.  Large scale Gaussian Process for overlap-based object proposal scoring , 2016, Comput. Vis. Image Underst..

[47]  Zoubin Ghahramani,et al.  Local and global sparse Gaussian process approximations , 2007, AISTATS.

[48]  Cristian Sminchisescu,et al.  Greedy Block Coordinate Descent for Large Scale Gaussian Process Regression , 2008, UAI.

[49]  Miguel Lázaro-Gredilla,et al.  Variational Inference for Mahalanobis Distance Metrics in Gaussian Process Regression , 2013, NIPS.

[50]  Geoffrey E. Hinton,et al.  On the importance of initialization and momentum in deep learning , 2013, ICML.

[51]  Neil D. Lawrence,et al.  Fast Sparse Gaussian Process Methods: The Informative Vector Machine , 2002, NIPS.

[52]  Shiliang Sun,et al.  Kernel regression with sparse metric learning , 2013, J. Intell. Fuzzy Syst..

[53]  Horst Bischof,et al.  Large scale metric learning from equivalence constraints , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.