Contrastive Losses and Solution Caching for Predict-and-Optimize

Many decision-making processes involve solving a combinatorial optimization problem whose uncertain input must be estimated from historical data. Recently, problems in this class have been successfully addressed via end-to-end learning approaches, which rely on solving one optimization problem per training instance at every epoch. In this context, we provide two distinct contributions. First, we use a Noise Contrastive approach to motivate a family of surrogate loss functions, based on viewing non-optimal solutions as negative examples. Second, we address a major bottleneck of all predict-and-optimize approaches, namely the need to frequently recompute optimal solutions at training time. We do so via a solver-agnostic solution caching scheme that replaces optimization calls with a lookup in the solution cache. The method is formally based on an inner approximation of the feasible space and, combined with a cache-lookup strategy, provides a controllable trade-off between training time and accuracy of the loss approximation. We empirically show that even a very slow cache growth rate is enough to match the quality of state-of-the-art methods, at a fraction of the computational cost.
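To make the two contributions more concrete, the sketch below illustrates them for a linear minimization objective c^T x. This is a minimal illustration, not the paper's implementation: the class and function names (SolutionCache, solve_or_lookup, p_solve, contrastive_loss) and the specific surrogate shown (the predicted cost of the true optimum contrasted against cached non-optimal solutions) are assumptions made for exposition only.

    import random
    import torch

    class SolutionCache:
        """Pool of feasible solutions, acting as an inner approximation
        of the feasible space (illustrative sketch, not the paper's code)."""
        def __init__(self, solver, p_solve=0.1):
            self.solver = solver      # expensive combinatorial solver: c_hat -> solution
            self.p_solve = p_solve    # cache growth rate: probability of a true solver call
            self.pool = []            # cached feasible solutions (1-D tensors)

        def solve_or_lookup(self, c_hat):
            # With small probability call the solver and grow the cache;
            # otherwise return the best cached solution for the predicted costs.
            if not self.pool or random.random() < self.p_solve:
                x = self.solver(c_hat)
                self.pool.append(x.detach())
                return x
            pool = torch.stack(self.pool)          # shape (k, n)
            values = pool @ c_hat                  # predicted cost of each cached solution
            return pool[torch.argmin(values)]

    def contrastive_loss(c_hat, x_true, cache):
        """One member of the surrogate family: the true optimum should have
        lower predicted cost than the cached (non-optimal) negative examples."""
        if not cache.pool:
            return c_hat.sum() * 0.0               # no negatives yet, zero loss
        negatives = torch.stack(cache.pool)        # non-optimal solutions as negatives
        margins = (x_true - negatives) @ c_hat     # c_hat^T x* - c_hat^T x_neg, per negative
        return margins.mean()

In a training loop, contrastive_loss would replace the decision-focused loss and solve_or_lookup would replace the per-instance solver call; the single parameter p_solve then controls the trade-off between training time and the accuracy of the loss approximation described in the abstract.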
