Semi-Supervised Multitask Learning

A semi-supervised multitask learning (MTL) framework is presented, in which M parameterized semi-supervised classifiers, each associated with one of M partially labeled data manifolds, are learned jointly under a soft-sharing prior imposed over the classifier parameters. The unlabeled data are exploited by basing classifier learning on neighborhoods induced by a Markov random walk over a graph representation of each manifold. Experimental results on real data sets demonstrate that semi-supervised MTL yields significant improvements in generalization performance over both semi-supervised single-task learning (STL) and supervised MTL.
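The neighborhood construction the abstract refers to can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a fully connected Gaussian-affinity graph (the Szummer–Jaakkola construction typically uses a kNN graph), and the function name, bandwidth `sigma`, and walk length `t` are illustrative choices.

```python
import numpy as np

def random_walk_neighborhoods(X, sigma=1.0, t=3):
    """t-step Markov random walk transition probabilities over a
    Gaussian-affinity graph on the points in X (rows = samples).
    Row i of the result is the soft neighborhood of point i."""
    # pairwise squared Euclidean distances
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))    # Gaussian edge affinities
    np.fill_diagonal(W, 0.0)              # no self-loops
    A = W / W.sum(axis=1, keepdims=True)  # one-step transition matrix
    return np.linalg.matrix_power(A, t)   # t-step transition matrix

# Toy "manifold": two well-separated noisy clusters with one labeled
# point each; the walk keeps probability mass within each cluster, so
# unlabeled points inherit the label of the cluster they sit on.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(3, 0.3, (20, 2))])
P = random_walk_neighborhoods(X, sigma=0.5, t=4)
scores = P[:, [0, 20]]       # walk mass reaching each labeled point
pred = scores.argmax(axis=1)  # nearest labeled point under the walk
```

In the paper's setting, such walk-induced neighborhoods enter the classifier's likelihood for each of the M tasks rather than being used for direct label propagation as in this sketch.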
