Multi-task Learning

A fundamental limitation of standard machine learning methods is the cost of preparing the large training samples needed for good generalization. Multi-task learning offers a potential remedy: in many cases, while individual sample sizes are rather small, samples are available for a large number of learning tasks (linear regression problems) that share some constraining or generative property. If this property is sufficiently simple, it should allow the individual tasks to be learned better despite their small sample sizes. In this talk I will review a wide class of multi-task learning methods that encourage low-dimensional representations of the regression vectors. I will describe techniques for solving the underlying optimization problems and present an analysis of the generalization performance of these methods, which proves the superiority of multi-task learning under specific conditions.

ROKS 2013
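As a concrete illustration (not taken from the talk itself), one prominent method in this class stacks the per-task regression vectors as the columns of a matrix W and penalizes its trace (nuclear) norm, min_W sum_t (1/(2 n_t)) ||X_t w_t - y_t||^2 + lambda ||W||_*, which encourages the tasks to share a common low-dimensional subspace. The sketch below solves this objective by proximal gradient descent with singular value thresholding; the function names, parameter values, and data are illustrative, not the specific algorithm of the talk.

    # Minimal sketch: trace-norm regularized multi-task linear regression,
    # solved by proximal gradient descent (ISTA) with singular value thresholding.
    import numpy as np

    def svt(W, tau):
        """Singular value thresholding: the prox operator of tau * trace norm."""
        U, s, Vt = np.linalg.svd(W, full_matrices=False)
        return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

    def multitask_trace_norm(Xs, ys, lam=0.1, step=None, n_iter=500):
        """Minimize sum_t ||X_t w_t - y_t||^2 / (2 n_t) + lam * ||W||_*.

        Xs, ys: lists of per-task design matrices and target vectors.
        Returns W of shape (d, T), one column per task.
        """
        d, T = Xs[0].shape[1], len(Xs)
        W = np.zeros((d, T))
        if step is None:
            # Lipschitz bound for the gradient of the smooth part
            # (the gradient is separable across the columns of W).
            L = max(np.linalg.norm(X, 2) ** 2 / X.shape[0] for X in Xs)
            step = 1.0 / L
        for _ in range(n_iter):
            G = np.zeros_like(W)
            for t, (X, y) in enumerate(zip(Xs, ys)):
                G[:, t] = X.T @ (X @ W[:, t] - y) / X.shape[0]
            # gradient step on the quadratic losses, prox step on the trace norm
            W = svt(W - step * G, step * lam)
        return W

    # Illustrative usage: tasks whose true regression vectors span a rank-2 subspace.
    rng = np.random.default_rng(0)
    d, T, r, n = 30, 10, 2, 15               # 15 samples per task
    W_true = rng.normal(size=(d, r)) @ rng.normal(size=(r, T))
    Xs = [rng.normal(size=(n, d)) for _ in range(T)]
    ys = [X @ W_true[:, t] + 0.1 * rng.normal(size=n) for t, X in enumerate(Xs)]
    W_hat = multitask_trace_norm(Xs, ys, lam=0.05)

Because the trace norm shrinks the singular values of W jointly across tasks, the tasks effectively pool their samples to estimate the shared subspace, which is the mechanism behind the generalization gains the abstract refers to.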
