Multi-Task Learning with User Preferences: Gradient Descent with Controlled Ascent in Pareto Optimization

Multi-Task Learning (MTL) is a well-established paradigm for jointly learning models for multiple correlated tasks. Often the tasks conflict, requiring trade-offs between them during optimization. In such cases, MTL methods based on multi-objective optimization can be used to find one or more Pareto optimal solutions. A common requirement in MTL applications, which these methods cannot address, is to find a solution that satisfies user-specified preferences with respect to the task-specific losses. We advance the state of the art by developing the first gradient-based multi-objective MTL algorithm to solve this problem. Our approach combines multiple gradient descent with carefully controlled ascent to traverse the Pareto front in a principled manner, which also makes it robust to initialization. The scalability of our algorithm enables its use in large-scale deep networks for MTL. Assuming only differentiability of the task-specific loss functions, we provide theoretical guarantees for convergence. Our experiments show that our algorithm outperforms the best competing methods on benchmark datasets.
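
To make the "multiple gradient descent" ingredient concrete, the following is a minimal sketch of the standard two-task common descent direction (the minimum-norm point in the convex hull of the two task gradients, as in MGDA). It is not the algorithm proposed here: it contains no controlled ascent and no preference handling. The toy losses, the helper names (`dot`, `min_norm_direction`, `grads`), the step size, and the starting point are all assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def min_norm_direction(g1: Sequence[float], g2: Sequence[float]) -> Tuple[float, List[float]]:
    """Return (alpha, d) with d = alpha*g1 + (1-alpha)*g2 of minimum norm, alpha in [0, 1].

    Moving along -d does not increase either loss to first order.
    """
    diff = [a - b for a, b in zip(g1, g2)]
    denom = dot(diff, diff)
    if denom == 0.0:
        alpha = 0.5  # identical gradients: any convex combination gives the same d
    else:
        # Closed-form minimiser of ||g2 + alpha*(g1 - g2)||^2, clipped to [0, 1].
        alpha = dot([b - a for a, b in zip(g1, g2)], g2) / denom
        alpha = min(1.0, max(0.0, alpha))
    d = [alpha * a + (1.0 - alpha) * b for a, b in zip(g1, g2)]
    return alpha, d


def grads(x: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Gradients of two hypothetical conflicting losses:
    f1(x) = (x0 - 1)^2 + x1^2 and f2(x) = (x0 + 1)^2 + x1^2.
    Their Pareto set is the segment x1 = 0, -1 <= x0 <= 1.
    """
    g1 = [2.0 * (x[0] - 1.0), 2.0 * x[1]]
    g2 = [2.0 * (x[0] + 1.0), 2.0 * x[1]]
    return g1, g2


# Plain descent along the common direction from an arbitrary starting point.
x = [0.3, 2.0]
for _ in range(100):
    g1, g2 = grads(x)
    _, d = min_norm_direction(g1, g2)
    x = [xi - 0.1 * di for xi, di in zip(x, d)]
```

On this toy problem the iterate reaches the Pareto set, but where it lands on that set is determined by the starting point rather than by any stated preference over the two losses. That is the gap a preference-aware method is meant to close.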
