Multi-Task Predict-then-Optimize

The predict-then-optimize framework arises in a wide variety of applications where the unknown cost coefficients of an optimization problem are first predicted from contextual features and then used to solve the problem. In this work, we extend the predict-then-optimize framework to a multi-task setting: contextual features must be used to simultaneously predict the cost coefficients of multiple optimization problems, possibly with different feasible regions. For instance, in a vehicle dispatch/routing application, features such as time of day, traffic, and weather must be used to predict travel times on the edges of a road network for multiple traveling salesperson problems spanning different target locations and multiple s-t shortest path problems with different source-target pairs. We propose a set of methods for this setting, the most sophisticated of which draws on advances in multi-task deep learning that enable information sharing between tasks for improved learning, particularly in the small-data regime. Our experiments demonstrate that multi-task predict-then-optimize methods offer favorable tradeoffs in performance across tasks, particularly when training data is scarce and the number of tasks is large.
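
To make the multi-task architecture concrete, here is a minimal PyTorch sketch of a hard-parameter-sharing design in the spirit the abstract describes: a shared encoder maps contextual features to a common representation, and each task (e.g., one TSP or one s-t shortest-path instance) gets its own head predicting that task's cost coefficients. All class names, dimensions, and the placeholder MSE loss are illustrative assumptions; in an actual predict-then-optimize pipeline, each task's loss would be a decision-focused loss such as SPO+ computed with that task's solver.

```python
import torch
import torch.nn as nn

class MultiTaskCostPredictor(nn.Module):
    """Hypothetical hard-parameter-sharing predictor for multi-task
    predict-then-optimize: one shared encoder, one head per task."""

    def __init__(self, n_features, task_output_dims, hidden=64):
        super().__init__()
        # Shared trunk: maps contextual features (time of day, traffic,
        # weather, ...) to a common representation used by every task.
        self.encoder = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # One linear head per task; output size = number of cost
        # coefficients for that task (e.g., edges in its subgraph).
        self.heads = nn.ModuleList(
            nn.Linear(hidden, d) for d in task_output_dims
        )

    def forward(self, x):
        z = self.encoder(x)
        return [head(z) for head in self.heads]

# Toy usage: two tasks whose graphs have 40 and 25 edges, respectively.
model = MultiTaskCostPredictor(n_features=10, task_output_dims=[40, 25])
x = torch.randn(8, 10)          # batch of contextual feature vectors
preds = model(x)                # list of per-task cost predictions

# Placeholder targets and MSE keep this sketch runnable; a real setup
# would instead plug each task's predicted costs into a decision-focused
# loss (e.g., SPO+) backed by that task's optimization model.
targets = [torch.randn_like(p) for p in preds]
loss = sum(nn.functional.mse_loss(p, t) for p, t in zip(preds, targets))
loss.backward()
```

Summing the per-task losses, as above, is the simplest aggregation; the multi-task learning literature also offers weighted or gradient-balanced alternatives when tasks conflict.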
