Paper Information - Scalable Pareto Front Approximation for Deep Multi-Objective Learning

Multi-objective optimization is important for various deep learning applications; however, no prior multi-objective method suits very deep networks. Existing approaches either require training a new network for every solution on the Pareto front or add considerable parameter overhead by introducing hypernetworks conditioned on modifiable preferences. In this paper, we present a novel method that conditions the network directly on the preferences by adding them to the input space. In addition, we ensure a well-spread Pareto front by forcing the solutions to maintain a small angle to the preference vector. Through extensive experiments, we demonstrate that our Pareto fronts achieve state-of-the-art quality despite being computed significantly faster. Furthermore, we demonstrate scalability: our method approximates the full Pareto front on the CelebA dataset with an EfficientNet network at a marginal training-time overhead of 7% compared to single-objective optimization. We make the code publicly available at https://github.com/ruchtem/cosmos.
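The two ideas in the abstract, appending the preference vector to the input and keeping the loss vector at a small angle to that preference, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Dirichlet sampling of preferences, the linear weighting of the losses, and the `angle_weight` parameter are all assumptions made for illustration. The abstract only states that preferences are added to the input space and that a small angle to the preference vector is enforced.

```python
import math
import random


def sample_preference(num_objectives, alpha=1.0, rng=random):
    """Draw a preference vector on the simplex (assumed: symmetric Dirichlet)."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(num_objectives)]
    total = sum(draws)
    return [d / total for d in draws]


def condition_input(features, preference):
    """Append the preference vector to the input features."""
    return list(features) + list(preference)


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; returns 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def preference_loss(losses, preference, angle_weight=1.0):
    """Assumed objective: preference-weighted sum of the per-objective losses,
    minus a reward for a small angle between the loss vector and the preference."""
    weighted = sum(r * l for r, l in zip(preference, losses))
    return weighted - angle_weight * cosine_similarity(losses, preference)


if __name__ == "__main__":
    pref = sample_preference(2, rng=random.Random(0))
    print(condition_input([0.1, 0.9], pref))
    # Both loss vectors have the same weighted sum under an equal preference,
    # but the one aligned with the preference gets the lower objective value.
    print(preference_loss([0.2, 0.2], [0.5, 0.5]))
    print(preference_loss([0.4, 0.0], [0.5, 0.5]))
```

In this sketch a single network would see a freshly sampled preference with each batch, so one training run covers many trade-offs; the angle term separates solutions that a weighted sum alone would score identically.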

Josif Grabocka | Michael Ruchte
