Towards One-shot Neural Combinatorial Solvers: Theoretical and Empirical Notes on the Cardinality-Constrained Case

One-shot non-autoregressive neural networks, unlike RL-based solvers, have been actively adopted for combinatorial optimization (CO) problems and can be trained on the objective score in a self-supervised manner. Such methods offer superior efficiency (e.g., through parallelization) and hold promise for predictive CO problems arising in decision-making under uncertainty. However, discrete constraints remain a bottleneck for gradient-based neural solvers and are currently handled in three typical ways: 1) adding a soft penalty to the objective, which cannot guarantee a bounded constraint violation, a critical shortcoming in constraint-sensitive scenarios; 2) perturbing the input to obtain approximate gradients in a black-box manner, which obeys the constraints exactly but whose approximate gradients can hurt the objective score; 3) compromising with soft algorithms whose network outputs satisfy only a relaxed constraint, so an arbitrary degree of violation can still occur. Towards the ultimate goal of a general neural CO framework that can control an arbitrarily small degree of constraint violation, this paper focuses on a more tractable and common setting: cardinality constraints, which can be readily encoded by a differentiable optimal transport (OT) layer. Based on this observation, we propose an OT-based cardinality constraint encoding for end-to-end CO learning with two variants, Sinkhorn and Gumbel-Sinkhorn, whose constraint violation can be exactly characterized and bounded by our theoretical results. On synthetic and real-world CO instances, our methods surpass a state-of-the-art CO network and are comparable to (if not better than) the commercial solver Gurobi. As a further case study, we apply our approach to predictive portfolio optimization on real-world asset price data, improving the Sharpe ratio of a strong LSTM+Gurobi baseline from 1.1 to 2.0 under the classic predict-then-optimize paradigm.
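
To make the constraint-encoding idea concrete, below is a minimal sketch (not the paper's implementation) of a differentiable cardinality-k (top-k) selection layer built from entropic optimal transport with Sinkhorn iterations, together with a Gumbel-perturbed variant. The function names, the two-anchor cost design, and all hyperparameter values are illustrative assumptions.

```python
import torch

def sinkhorn_topk(scores, k, tau=0.05, n_iters=100):
    """Soft cardinality-k selection via entropic optimal transport.

    Transports n items (uniform mass 1/n each) to two bins, "unselected"
    (capacity (n - k)/n) and "selected" (capacity k/n), using Sinkhorn
    iterations in log space. The "selected" column of the transport plan,
    rescaled by n, is a differentiable relaxation of a k-hot indicator:
    entries lie in [0, 1] and sum to k by construction.
    """
    n = scores.shape[-1]
    # Cost of assigning each item to the two bins: squared distance of its
    # score to two anchors (the minimum and maximum score).
    anchors = torch.stack(
        [scores.min(dim=-1, keepdim=True).values,
         scores.max(dim=-1, keepdim=True).values], dim=-1)          # (..., 1, 2)
    cost = (scores.unsqueeze(-1) - anchors) ** 2                    # (..., n, 2)

    log_mu = torch.log(torch.full_like(scores, 1.0 / n))            # row marginals
    log_nu = torch.log(torch.tensor([(n - k) / n, k / n],
                                    dtype=scores.dtype,
                                    device=scores.device)).expand(*scores.shape[:-1], 2)

    log_K = -cost / tau
    f = torch.zeros_like(log_mu)
    g = torch.zeros_like(log_nu)
    for _ in range(n_iters):  # log-domain Sinkhorn for numerical stability
        f = log_mu - torch.logsumexp(log_K + g.unsqueeze(-2), dim=-1)
        g = log_nu - torch.logsumexp(log_K + f.unsqueeze(-1), dim=-2)
    plan = torch.exp(log_K + f.unsqueeze(-1) + g.unsqueeze(-2))     # (..., n, 2)
    return plan[..., 1] * n  # soft selection vector, summing to k

def gumbel_sinkhorn_topk(scores, k, tau=0.05, noise_scale=0.1, n_iters=100):
    """Gumbel-Sinkhorn variant: perturb scores with Gumbel noise before the
    same Sinkhorn layer, giving a stochastic relaxation useful for training."""
    u = torch.rand_like(scores).clamp_min(1e-10)
    gumbel = -torch.log(-torch.log(u))
    return sinkhorn_topk(scores + noise_scale * gumbel, k, tau, n_iters)
```

Feeding the soft selection vector into the CO objective (e.g., the portfolio return in the predict-then-optimize case study) keeps the whole pipeline differentiable; decreasing tau tightens the relaxation toward a hard top-k selection at the cost of slower Sinkhorn convergence.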
