Learning Multiple Tasks using Shared Hypotheses

In this work we consider a setting with a very large number of related tasks but only a few examples from each individual task. Rather than learning each task individually (and incurring a large generalization error) or learning all tasks together using a single hypothesis (and suffering a potentially large inherent error), we learn a small pool of shared hypotheses. Each task is then mapped to a single hypothesis in the pool (a hard association). We derive VC-dimension generalization bounds for our model in terms of the number of tasks, the number of shared hypotheses, and the VC dimension of the hypothesis class. We conducted experiments on both synthetic problems and sentiment analysis of reviews, which strongly support our approach.
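The hard-association scheme described above can be illustrated by a k-means-style alternating procedure: hard-assign each task to the pooled hypothesis with the lowest training error on that task, then refit each hypothesis on the union of its assigned tasks' data. This is a minimal sketch under assumptions of my own (a least-squares linear base learner, random initialization, and all function names below); it is not the paper's exact algorithm.

```python
import numpy as np

def fit_least_squares(X, y):
    """Least-squares linear classifier: w minimizing ||Xw - y||^2, labels in {-1,+1}."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def task_error(w, X, y):
    """Training error of hypothesis w on one task's sample."""
    return float(np.mean(np.sign(X @ w) != y))

def learn_shared_pool(tasks, k, iters=10, seed=0):
    """Learn k shared hypotheses for many tasks via alternating optimization.

    tasks: list of (X, y) pairs, one per task, with y in {-1,+1}.
    Returns (ws, assign): the pooled hypotheses and the hard task-to-hypothesis map.
    """
    rng = np.random.default_rng(seed)
    T = len(tasks)
    assign = rng.integers(0, k, size=T)  # random initial hard association
    ws = [None] * k
    for _ in range(iters):
        # Refit each pooled hypothesis on all data of its assigned tasks.
        for j in range(k):
            idx = [t for t in range(T) if assign[t] == j]
            if not idx:  # re-seed an empty pool with a random task
                idx = [int(rng.integers(T))]
            X = np.vstack([tasks[t][0] for t in idx])
            y = np.concatenate([tasks[t][1] for t in idx])
            ws[j] = fit_least_squares(X, y)
        # Hard-assign each task to the hypothesis with the lowest training error.
        assign = np.array([min(range(k), key=lambda j: task_error(ws[j], *tasks[t]))
                           for t in range(T)])
    return ws, assign
```

With tasks drawn from a few underlying separators, this typically recovers the grouping even when each task alone has too few examples to learn reliably, which is the regime the abstract targets.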
