Optimizing Evaluation Metrics for Multitask Learning via the Alternating Direction Method of Multipliers

Multitask learning (MTL) aims to improve the generalization performance of multiple tasks by exploiting the factors shared among them. Various metrics (e.g., $F$-score, area under the ROC curve) are used to evaluate the performance of MTL methods, yet most existing MTL methods minimize either the misclassification error for classification or the mean squared error for regression. In this paper, we propose a method to directly optimize the evaluation metrics for a large family of MTL problems. The formulation combines two parts: 1) a regularizer defined on the weight matrix over all tasks, which captures the relatedness of these tasks, and 2) a sum of structured hinge losses, each serving as a surrogate for an evaluation metric on one task. This formulation is challenging to optimize because both parts are nonsmooth. To tackle this issue, we propose a novel optimization procedure based on the alternating direction method of multipliers (ADMM), which decomposes the whole optimization problem into one subproblem corresponding to the regularizer and another corresponding to the structured hinge losses. For a large family of MTL problems, the first subproblem has a closed-form solution. To solve the second subproblem, we propose an efficient primal-dual algorithm based on coordinate ascent. Extensive experimental results demonstrate that, on a large family of MTL problems, the proposed method of directly optimizing evaluation metrics achieves consistent performance gains over the corresponding baseline methods.
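A minimal sketch of the splitting described above may help make the two subproblems concrete. The notation here (weight matrix $W$ with per-task columns $w_t$, auxiliary variable $Z$, scaled dual variable $U$, regularizer $\Omega$, per-task structured hinge surrogates $\ell_t$, and penalty parameter $\rho$) is assumed for illustration and is not necessarily the paper's own:

\begin{align*}
  &\min_{W}\; \Omega(W) + \sum_{t=1}^{T} \ell_t(w_t)
    && \text{(regularizer + per-task structured hinge surrogates)}\\
  \Leftrightarrow\;
  &\min_{W,\,Z}\; \Omega(W) + \sum_{t=1}^{T} \ell_t(z_t)
    \quad \text{s.t.}\; W = Z
    && \text{(variable splitting for ADMM)}\\[4pt]
  &W^{k+1} = \operatorname*{arg\,min}_{W}\; \Omega(W)
    + \tfrac{\rho}{2}\,\|W - Z^{k} + U^{k}\|_F^2
    && \text{(often closed form, e.g., singular value thresholding)}\\
  &Z^{k+1} = \operatorname*{arg\,min}_{Z}\; \sum_{t=1}^{T} \ell_t(z_t)
    + \tfrac{\rho}{2}\,\|W^{k+1} - Z + U^{k}\|_F^2
    && \text{(decouples over tasks; e.g., primal-dual coordinate ascent)}\\
  &U^{k+1} = U^{k} + W^{k+1} - Z^{k+1}
    && \text{(scaled dual update)}
\end{align*}

The splitting shows why each ADMM iteration is tractable: the $W$-update involves only the regularizer plus a quadratic term, while the $Z$-update separates into one structured-loss problem per task.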
