Conditional multi-output regression

In multi-output regression, the goal is to establish a mapping from inputs to multivariate outputs that are often assumed unknown. However, in practice, some outputs may become available. How can we use this extra information to improve our prediction on the remaining outputs? For example, can we use the job data released today to better predict the house sales data to be released tomorrow? Most previous approaches use a single generative model to model the joint predictive distribution of all outputs, based on which unknown outputs are inferred conditionally from the known outputs. However, learning such a joint distribution for all outputs is very challenging and also unnecessary if our goal is just to predict each of the unknown outputs. We propose a conditional model to directly model the conditional probability of a target output on both inputs and all other outputs. A simple generative model is used to infer other outputs if they are unknown. Both models only consist of standard regression predictors, for example, Gaussian process, which can be easily learned.

[1]  José M. Matías Multi-output Nonparametric Regression , 2005, EPIA.

[2]  Michael E. Tipping The Relevance Vector Machine , 1999, NIPS.

[3]  E. Walter,et al.  Multi-Output Suppport Vector Regression , 2003 .

[4]  Nir Friedman,et al.  Gaussian Process Networks , 2000, UAI.

[5]  Carl E. Rasmussen,et al.  Gaussian processes for machine learning , 2005, Adaptive computation and machine learning.

[6]  Hao Zhang,et al.  Maximum‐likelihood estimation for multivariate spatial linear coregionalization models , 2007 .

[7]  P. Goovaerts Ordinary Cokriging Revisited , 1998 .

[8]  Meila,et al.  Kernel multitask learning using task-specific features , 2007 .

[9]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.

[10]  Ashutosh Saxena,et al.  Cascaded Classification Models: Combining Models for Holistic Scene Understanding , 2008, NIPS.

[11]  Tom Heskes,et al.  Task Clustering and Gating for Bayesian Multitask Learning , 2003, J. Mach. Learn. Res..

[12]  Agathe Girard,et al.  Prediction at an Uncertain Input for Gaussian Processes and Relevance Vector Machines Application to Multiple-Step Ahead Time-Series Forecasting , 2002 .

[13]  Neil D. Lawrence,et al.  Sparse Convolved Gaussian Processes for Multi-output Regression , 2008, NIPS.

[14]  Anton Schwaighofer,et al.  Learning Gaussian processes from multiple tasks , 2005, ICML.

[15]  Rich Caruana,et al.  Multitask Learning , 1997, Machine-mediated learning.

[16]  J. Friedman,et al.  Predicting Multivariate Responses in Multiple Linear Regression , 1997 .

[17]  Edwin V. Bonilla,et al.  Multi-task Gaussian Process Prediction , 2007, NIPS.

[18]  Michael I. Jordan,et al.  An Introduction to Variational Methods for Graphical Models , 1999, Machine-mediated learning.

[19]  Neil D. Lawrence,et al.  Learning to learn with the informative vector machine , 2004, ICML.

[20]  Massimiliano Pontil,et al.  Multi-Task Feature Learning , 2006, NIPS.

[21]  Charles A. Micchelli,et al.  Learning Multiple Tasks with Kernel Methods , 2005, J. Mach. Learn. Res..

[22]  Lawrence Carin,et al.  Multi-Task Learning for Classification with Dirichlet Process Priors , 2007, J. Mach. Learn. Res..

[23]  Carl E. Rasmussen,et al.  A Unifying View of Sparse Approximate Gaussian Process Regression , 2005, J. Mach. Learn. Res..

[24]  Wei Chu,et al.  Bayesian support vector regression using a unified loss function , 2004, IEEE Transactions on Neural Networks.

[25]  Tong Zhang,et al.  A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data , 2005, J. Mach. Learn. Res..

[26]  Anton Schwaighofer,et al.  Learning Gaussian Process Kernels via Hierarchical Bayes , 2004, NIPS.

[27]  Yee Whye Teh,et al.  Semiparametric latent factor models , 2005, AISTATS.

[28]  Edwin V. Bonilla,et al.  Kernel Multi-task Learning using Task-specific Features , 2007, AISTATS.