The stochastic multi-gradient algorithm for multi-objective optimization and its application to supervised machine learning

Optimizing conflicting functions is central to decision making, and real-world applications frequently involve uncertain or unknown data, which leads to multi-objective optimization (MOO) problems of stochastic type. We study the stochastic multi-gradient (SMG) method, seen as an extension of the classical stochastic gradient method for single-objective optimization. At each iteration of the SMG method, a stochastic multi-gradient direction is computed by solving a quadratic subproblem, and we show that this direction is biased even when all individual gradient estimators are unbiased. We establish convergence rates for computing a point on the Pareto front, of an order similar to those known for the stochastic gradient method in both the convex and strongly convex cases. The analysis handles the bias in the multi-gradient and the weights of the limiting Pareto point, which are not known a priori. The SMG method is then embedded in a Pareto-front type algorithm for computing the entire Pareto front. The Pareto-front SMG algorithm robustly determines the Pareto fronts of a number of synthetic test problems. It can be applied to any stochastic MOO problem arising from supervised machine learning, and we report results for logistic binary classification in which the multiple objectives correspond to data groups from distinct sources.
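To make the iteration concrete, the following is a minimal sketch of an SMG-style loop for two objectives, not the authors' implementation. It assumes toy objectives f1(x) = 0.5||x - a||^2 and f2(x) = 0.5||x - b||^2 with Gaussian gradient noise, a 1/(k+1) step size, and the closed-form solution of the quadratic subproblem that is available when there are only two gradients. The names `min_norm_weight`, `noisy_grad`, and `smg` are hypothetical.

```python
import random


def min_norm_weight(g1, g2):
    """Weight lam in [0, 1] minimizing ||lam*g1 + (1-lam)*g2||^2.

    This is the quadratic subproblem restricted to two objectives,
    where it has a closed-form solution (an assumption of this sketch).
    """
    diff = [u - v for u, v in zip(g1, g2)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        return 0.5  # identical gradients: every weight gives the same direction
    # Unconstrained minimizer of the quadratic, then clipped to [0, 1].
    lam = sum((v - u) * v for u, v in zip(g1, g2)) / denom
    return min(1.0, max(0.0, lam))


def noisy_grad(x, center, sigma, rng):
    """Unbiased estimate of the gradient of 0.5*||x - center||^2."""
    return [xi - ci + rng.gauss(0.0, sigma) for xi, ci in zip(x, center)]


def smg(x0, a, b, sigma=0.1, iters=2000, seed=0):
    """Run the sketched SMG loop on the two toy quadratics."""
    rng = random.Random(seed)
    x = list(x0)
    for k in range(iters):
        g1 = noisy_grad(x, a, sigma, rng)
        g2 = noisy_grad(x, b, sigma, rng)
        lam = min_norm_weight(g1, g2)  # depends on the noisy gradients
        step = 1.0 / (k + 1)           # assumed diminishing step size
        x = [xi - step * (lam * u + (1.0 - lam) * v)
             for xi, u, v in zip(x, g1, g2)]
    return x


if __name__ == "__main__":
    # The Pareto set of the toy problem is the segment between a and b.
    print(smg([1.0, 3.0], a=[0.0, 0.0], b=[2.0, 0.0]))
```

Because the weight `lam` is computed from the noisy gradients themselves, the combined direction is a nonlinear function of the noise. This is one way to see why the multi-gradient can be biased even when each individual gradient estimator is unbiased.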
