Kernel Multi-task Learning using Task-specific Features

This paper addresses multi-task learning when task-specific features are available. We describe two approaches based on Gaussian process predictors. In the first, the data from all tasks are combined into a single dataset, making use of the task-specific features. In the second, we train a separate predictor for each reference task and then combine their predictions using a gating network. We demonstrate both methods on a compiler performance prediction problem, in which a task is defined as predicting the speed-up obtained by applying a sequence of code transformations to a given program.
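The two approaches can be illustrated with a minimal NumPy sketch. It is not the paper's implementation. It rests on several assumptions: a squared-exponential kernel with fixed hyperparameters, task features appended directly to the inputs for the pooled model, and a fixed softmax gate over squared distances between task feature vectors standing in for a trained gating network. All function names, the toy data, and the `temperature` parameter are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def gp_predict_mean(X_train, y_train, X_test, noise=1e-2, lengthscale=1.0):
    """Posterior mean of a zero-mean GP regressor (hyperparameters fixed, not learned)."""
    K = rbf_kernel(X_train, X_train, lengthscale) + noise * np.eye(len(X_train))
    K_star = rbf_kernel(X_test, X_train, lengthscale)
    return K_star @ np.linalg.solve(K, y_train)

def with_task_features(X, t):
    """Append the task feature vector t to every input row of X."""
    return np.hstack([X, np.tile(t, (len(X), 1))])

def predict_combined(tasks, X_new, t_new):
    """Approach 1 (assumed form): pool all tasks into one dataset over inputs [x, t]."""
    X_all = np.vstack([with_task_features(X, t) for X, y, t in tasks])
    y_all = np.concatenate([y for X, y, t in tasks])
    return gp_predict_mean(X_all, y_all, with_task_features(X_new, t_new))

def predict_gated(tasks, X_new, t_new, temperature=1.0):
    """Approach 2 (assumed form): one GP per reference task, mixed by a softmax gate."""
    preds = np.stack([gp_predict_mean(X, y, X_new) for X, y, t in tasks])
    dists = np.array([np.sum((t - t_new) ** 2) for X, y, t in tasks])
    logits = -dists / temperature            # closer reference tasks get larger weight
    weights = np.exp(logits - logits.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights @ preds, weights

# Toy data (hypothetical): each task is a scaled sine, with the scale as its task feature.
rng = np.random.default_rng(0)

def make_task(t):
    X = rng.uniform(-3.0, 3.0, size=(20, 1))
    return X, t[0] * np.sin(X[:, 0]), t

tasks = [make_task(np.array([a])) for a in (0.5, 1.0, 2.0)]
X_new = np.linspace(-2.0, 2.0, 5)[:, None]
t_new = np.array([1.1])

mean_combined = predict_combined(tasks, X_new, t_new)
mean_gated, gate = predict_gated(tasks, X_new, t_new)
```

In this sketch the pooled model shares information across tasks through the kernel on the augmented input, whereas the gated model keeps the per-task predictors independent and uses the task features only to weight them.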
