3 Mixture of Factor Analyzers based Generative Model for Multitask Learning

We propose a general framework for learning multiple related tasks. The proposed unified framework can capture various types of latent structure underlying the weight vectors of multiple tasks. In doing so, our model can automatically adapt to the task-relatedness assumption warranted by a given dataset. For instance, the model captures common multitask learning notions such as a shared Gaussian prior in the parameter space, task clustering, and a low-rank assumption as special cases, or adapts itself to a more general combination of these assumptions, addressing their individual shortcomings. Our model therefore brings considerable flexibility compared to commonly used multitask learning models that are based on some a priori fixed notion of task relatedness. We also present an efficient inference algorithm for this model. Experimental results on several real-world datasets, on both regression and classification problems, establish the efficacy of the proposed method.
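The generative view described above can be illustrated with a minimal sketch: each task's weight vector is drawn from a mixture of factor analyzers, so a cluster indicator gives task clustering, the factor loadings give a low-rank subspace, and the noise term recovers a per-task Gaussian prior. All dimensions, variable names, and hyperparameters below are illustrative assumptions, not the paper's exact specification.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions): weight dim, latent factor dim,
# number of mixture components (task clusters), number of tasks.
D, K_latent, K_clusters, T = 10, 3, 2, 5

pi = np.array([0.6, 0.4])                            # mixture weights over clusters
mu = rng.normal(size=(K_clusters, D))                # per-cluster mean weight vector
Lambda = rng.normal(size=(K_clusters, D, K_latent))  # per-cluster factor loadings
sigma2 = 0.1                                         # isotropic noise variance

def sample_task_weight():
    """Draw one task's weight vector from the MFA prior."""
    k = rng.choice(K_clusters, p=pi)    # cluster assignment -> task clustering
    z = rng.normal(size=K_latent)       # low-dim latent factors -> low-rank structure
    eps = rng.normal(scale=np.sqrt(sigma2), size=D)  # per-task deviation
    return mu[k] + Lambda[k] @ z + eps  # w_t = mu_k + Lambda_k z_t + noise

W = np.vstack([sample_task_weight() for _ in range(T)])
print(W.shape)  # one D-dimensional weight vector per task
```

Under this sketch the special cases fall out directly: one cluster with zero loadings gives a single shared Gaussian prior; several clusters with zero loadings give pure task clustering; one cluster with nonzero loadings and small noise gives a shared low-rank subspace.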
