3 Mixture of Factor Analyzers based Generative Model for Multitask Learning

We propose a general framework for learning multiple related tasks. The proposed unified framework can capture various types of latent structure underlying the weight vectors of multiple tasks. In doing so, our model can automatically adapt to the task-relatedness assumption warranted by a given dataset. For instance, the model captures common multitask learning notions such as a shared Gaussian prior in the parameter space, task clustering, and a low-rank assumption as special cases, or adapts itself to a more general combination of these assumptions, addressing their individual shortcomings. Our model therefore brings considerable flexibility compared to commonly used multitask learning models that are based on some a priori fixed notion of task relatedness. We also present an efficient inference algorithm for this model. Experimental results on several real-world datasets, on both regression and classification problems, establish the efficacy of the proposed method.
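The generative view described above can be illustrated with a minimal sketch: each task's weight vector is drawn from a mixture of factor analyzers, so a cluster indicator gives task clustering, the factor loadings give a low-rank subspace, and the noise term recovers a per-task Gaussian prior. All dimensions, variable names, and hyperparameters below are illustrative assumptions, not the paper's exact specification.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions): weight dim, latent factor dim,
# number of mixture components (task clusters), number of tasks.
D, K_latent, K_clusters, T = 10, 3, 2, 5

pi = np.array([0.6, 0.4])                            # mixture weights over clusters
mu = rng.normal(size=(K_clusters, D))                # per-cluster mean weight vector
Lambda = rng.normal(size=(K_clusters, D, K_latent))  # per-cluster factor loadings
sigma2 = 0.1                                         # isotropic noise variance

def sample_task_weight():
    """Draw one task's weight vector from the MFA prior."""
    k = rng.choice(K_clusters, p=pi)    # cluster assignment -> task clustering
    z = rng.normal(size=K_latent)       # low-dim latent factors -> low-rank structure
    eps = rng.normal(scale=np.sqrt(sigma2), size=D)  # per-task deviation
    return mu[k] + Lambda[k] @ z + eps  # w_t = mu_k + Lambda_k z_t + noise

W = np.vstack([sample_task_weight() for _ in range(T)])
print(W.shape)  # one D-dimensional weight vector per task
```

Under this sketch the special cases fall out directly: one cluster with zero loadings gives a single shared Gaussian prior; several clusters with zero loadings give pure task clustering; one cluster with nonzero loadings and small noise gives a shared low-rank subspace.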
