Sparse Convolved Multiple Output Gaussian Processes

Recently there has been increasing interest in methods that deal with multiple outputs, motivated partly by frameworks such as multitask learning, multisensor networks and structured output data. From a Gaussian process perspective, the problem reduces to specifying an appropriate covariance function that, whilst being positive semi-definite, captures the dependencies between all the data points and across all the outputs. One approach to accounting for non-trivial correlations between outputs employs convolution processes: under a latent function interpretation of the convolution transform, we establish dependencies between output variables. The main drawbacks of this approach are its computational and storage demands, and in this paper we address these issues. We present different sparse approximations for dependent output Gaussian processes constructed through the convolution formalism. By exploiting the conditional independencies naturally present in the model, we obtain a form of the covariance similar in spirit to the so-called PITC and FITC approximations for a single output. We show experimental results with synthetic and real data, in particular for pollution prediction, school exam score prediction and gene expression data.
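To make the construction concrete, the following is a minimal sketch of a PITC-style covariance for convolved outputs. It is not the authors' implementation. It assumes one-dimensional inputs, a single latent function with a Gaussian covariance, and Gaussian smoothing kernels, so that every convolution integral has a closed form. The function names (`gauss`, `convolved_covariances`, `pitc_covariance`), the parameters `S`, `v`, `v_u`, and the toy inputs are all hypothetical choices made for illustration.

```python
import numpy as np

def gauss(diff, var):
    """Zero-mean 1-D Gaussian density with variance `var`, evaluated at `diff`."""
    return np.exp(-0.5 * diff**2 / var) / np.sqrt(2.0 * np.pi * var)

def convolved_covariances(X, Z, S, v, v_u):
    """Covariances implied by convolving one latent function u with Gaussian kernels.

    Assumed model (illustrative): f_d = S[d] * N(0, v[d]) convolved with u,
    where cov(u(z), u(z')) = N(z - z'; 0, v_u).
    X is a list of 1-D input arrays (one per output), Z holds the inducing inputs.
    """
    D = len(X)
    # Convolving Gaussians adds their variances, which gives the closed forms below.
    K_ff = np.block([[S[d] * S[e] * gauss(X[d][:, None] - X[e][None, :],
                                          v[d] + v[e] + v_u)
                      for e in range(D)] for d in range(D)])
    K_fu = np.vstack([S[d] * gauss(X[d][:, None] - Z[None, :], v[d] + v_u)
                      for d in range(D)])
    K_uu = gauss(Z[:, None] - Z[None, :], v_u)
    return K_ff, K_fu, K_uu

def pitc_covariance(K_ff, K_fu, K_uu, sizes, jitter=1e-8):
    """Low-rank term through the inducing values plus exact per-output blocks."""
    # The jitter is a numerical-stability assumption, not part of the model.
    Q = K_fu @ np.linalg.solve(K_uu + jitter * np.eye(len(K_uu)), K_fu.T)
    approx = Q.copy()
    start = 0
    for n in sizes:
        block = slice(start, start + n)
        # Outputs are treated as conditionally independent given u, so only the
        # within-output blocks are restored to their exact values.
        approx[block, block] = K_ff[block, block]
        start += n
    return approx

# Toy usage: two outputs observed at different inputs, 11 inducing inputs.
X = [np.linspace(0.0, 5.0, 20), np.linspace(0.0, 5.0, 15)]
Z = np.linspace(0.0, 5.0, 11)
sizes = [len(x) for x in X]
K_ff, K_fu, K_uu = convolved_covariances(X, Z, S=[1.0, 0.5], v=[0.05, 0.2], v_u=0.1)
K_pitc = pitc_covariance(K_ff, K_fu, K_uu, sizes)
```

In this sketch the cross-output blocks of `K_pitc` are carried only by the low-rank term, so solving with it can use the matrix inversion lemma rather than a dense factorization of the full matrix. Restoring only the diagonal of each block, instead of the whole block, would give the FITC-style variant.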
