Learning from Relevant Tasks Only

We introduce a problem called relevant subtask learning, a variant of multi-task learning. The goal is to build a classifier for a task-of-interest for which we have too little data. We also have data from other tasks, but only some of those tasks are relevant, meaning that they contain samples classified in the same way as in the task-of-interest. The problem is how to utilize this "background data" to improve the classifier of the task-of-interest. We show how to solve the problem for logistic regression classifiers, and show that the solution outperforms a comparable multi-task learning model. The key is to assume that the data of each task is a mixture of relevant and irrelevant samples, and to model the irrelevant part with a sufficiently flexible model so that it does not distort the model of the relevant data.
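The mixture idea above can be sketched with a small EM procedure. This is a minimal illustration, not the authors' exact model: it assumes a shared logistic regression for the "relevant" component, a separate per-task logistic regression as the flexible "irrelevant" component, and a per-task mixing weight; all function names and the synthetic data are invented for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def fit_weighted_logreg(X, y, w, n_iter=300, lr=0.5):
    """Weighted logistic regression (labels in {0,1}) via gradient ascent."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        beta += lr * (X.T @ (w * (y - p))) / max(w.sum(), 1e-12)
    return beta

def bernoulli_lik(p, y):
    """P(y | p) for Bernoulli labels."""
    return np.where(y == 1, p, 1.0 - p)

def relevant_subtask_em(X0, y0, bg_tasks, n_em=15):
    """EM sketch: each background-task sample is treated as a mixture of a
    'relevant' component (shared classifier of the task-of-interest) and an
    'irrelevant' component (flexible per-task nuisance classifier)."""
    beta = fit_weighted_logreg(X0, y0, np.ones(len(y0)))  # shared model
    nuis = [fit_weighted_logreg(X, y, np.ones(len(y))) for X, y in bg_tasks]
    pis = [0.5] * len(bg_tasks)  # per-task P(sample is relevant)
    for _ in range(n_em):
        resp = []
        for t, (X, y) in enumerate(bg_tasks):  # E-step: responsibilities
            lik_rel = pis[t] * bernoulli_lik(sigmoid(X @ beta), y)
            lik_irr = (1 - pis[t]) * bernoulli_lik(sigmoid(X @ nuis[t]), y)
            resp.append(lik_rel / (lik_rel + lik_irr + 1e-12))
        # M-step: shared model sees interest data (weight 1) plus
        # background samples weighted by their relevance responsibility.
        Xs = np.vstack([X0] + [X for X, _ in bg_tasks])
        ys = np.concatenate([y0] + [y for _, y in bg_tasks])
        ws = np.concatenate([np.ones(len(y0))] + resp)
        beta = fit_weighted_logreg(Xs, ys, ws)
        for t, (X, y) in enumerate(bg_tasks):
            nuis[t] = fit_weighted_logreg(X, y, 1.0 - resp[t])
            pis[t] = resp[t].mean()
    return beta, pis

# Synthetic check: one relevant background task (same labeling rule as the
# task-of-interest) and one irrelevant task (inverted labeling rule).
rng = np.random.default_rng(0)
w_true = np.array([2.0, -2.0, 1.0, 0.0, 0.0])
X0 = rng.normal(size=(20, 5));  y0 = (X0 @ w_true > 0).astype(float)
Xr = rng.normal(size=(200, 5)); yr = (Xr @ w_true > 0).astype(float)
Xi = rng.normal(size=(200, 5)); yi = (Xi @ -w_true > 0).astype(float)
beta, pis = relevant_subtask_em(X0, y0, [(Xr, yr), (Xi, yi)])

Xtest = rng.normal(size=(500, 5))
acc = ((sigmoid(Xtest @ beta) > 0.5) == (Xtest @ w_true > 0)).mean()
```

Because the irrelevant task's samples are well explained by its own flexible nuisance model, their responsibilities shrink toward zero and they stop distorting the shared classifier, which is the paper's central point.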