Probabilistic Low-Rank Multitask Learning

In this paper, we consider the problem of learning multiple related tasks simultaneously, with the goal of improving the generalization performance of each individual task. The key challenge is to exploit the information shared across tasks while preserving the discriminative information specific to each task. To address this, we propose a novel probabilistic model for multitask learning (MTL) that automatically balances low-rank and sparsity constraints: the former imposes a low-rank structure on the underlying predictive hypothesis space to explicitly capture the relationships among tasks, while the latter learns the incoherent sparse patterns private to each task. Inference is carried out with variational Bayesian methods. Experimental results on real-world regression and classification tasks demonstrate the effectiveness of the proposed method on MTL problems.
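A rough formalization of this low-rank-plus-sparse decomposition, given only as an illustrative sketch (the symbols W, L, S, U, V, the rank k, the noise variance \sigma^2, and the Laplace-type sparsity prior below are assumptions made here for concreteness; the paper's exact likelihood and priors may differ):

W = L + S, \qquad L = U V^\top, \quad U \in \mathbb{R}^{d \times k}, \; V \in \mathbb{R}^{T \times k}, \; k \ll \min(d, T),

y_{t,i} \mid \mathbf{x}_{t,i} \sim \mathcal{N}\!\left(\mathbf{x}_{t,i}^\top \mathbf{w}_t, \, \sigma^2\right), \qquad \mathbf{w}_t = \mathbf{l}_t + \mathbf{s}_t, \qquad p(s_{t,j}) \propto \exp\!\left(-\lambda \, |s_{t,j}|\right),

where W \in \mathbb{R}^{d \times T} stacks the T task weight vectors column-wise, the factorized component L carries the shared low-rank structure, and the columns of S carry the sparse patterns private to each task. Variational Bayes would then approximate the posterior over (U, V, S, \sigma^2) rather than optimizing a penalized point estimate.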
