Learning to Multitask

Multitask learning has shown promising performance in many applications, and many multitask models have been proposed. To identify an effective multitask model for a given multitask problem, we propose a learning framework called Learning to MultiTask (L2MT). L2MT exploits historical multitask experience, organized as a training set consisting of tuples, each of which contains a multitask problem with multiple tasks, a multitask model, and the relative test error. Based on this training set, L2MT first uses a proposed layerwise graph neural network to learn task embeddings for all the tasks in a multitask problem and then, via a unified formulation, learns an estimation function that predicts the relative test error from the task embeddings and the representation of the multitask model. Given a new multitask problem, the estimation function is used to identify a suitable multitask model. Experiments on benchmark datasets demonstrate the effectiveness of the proposed L2MT framework.
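The pipeline above can be illustrated with a minimal toy sketch. Everything here is a hypothetical stand-in, not the paper's method: the layerwise graph neural network is replaced by a mean-feature task embedding, candidate multitask models are represented by one-hot codes, and the estimation function is a plain linear least-squares fit on synthetic data.

```python
import numpy as np

rng = np.random.default_rng(0)
N_TASKS, DIM, N_MODELS = 2, 3, 4

def task_embedding(task_data):
    # Toy stand-in for the learned layerwise-GNN embedding: mean feature vector.
    return task_data.mean(axis=0)

def problem_embedding(tasks):
    # Concatenate per-task embeddings into one problem representation.
    return np.concatenate([task_embedding(t) for t in tasks])

def make_problem():
    # A toy multitask problem: N_TASKS tasks with random feature matrices.
    return [rng.normal(size=(20, DIM)) for _ in range(N_TASKS)]

model_reprs = np.eye(N_MODELS)          # one-hot codes for candidate models
true_w = rng.normal(size=N_TASKS * DIM + N_MODELS)  # hidden "ground truth"

# Historical experience: tuples of (problem, model representation, rel. error).
X, y = [], []
for _ in range(100):
    emb = problem_embedding(make_problem())
    for m in model_reprs:
        feat = np.concatenate([emb, m])
        X.append(feat)
        y.append(feat @ true_w)         # synthetic relative test error
X, y = np.array(X), np.array(y)

# Estimation function: least-squares map from (embedding, model) to error.
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Given a new multitask problem, pick the model with lowest estimated error.
new_emb = problem_embedding(make_problem())
scores = [np.concatenate([new_emb, m]) @ w_hat for m in model_reprs]
best = int(np.argmin(scores))
```

Since the synthetic errors are exactly linear in the features, the fitted estimator recovers the hidden weights and the selected model matches the true minimizer; with real experience data the estimation function would of course be learned with the paper's unified formulation rather than least squares.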
