Towards One-shot Neural Combinatorial Solvers: Theoretical and Empirical Notes on the Cardinality-Constrained Case

One-shot non-autoregressive neural networks, unlike RL-based solvers, have been actively adopted for combinatorial optimization (CO) problems and can be trained on the objective score in a self-supervised manner. Such methods offer superior efficiency (e.g., through parallelization) and hold promise for predictive CO problems arising in decision-making under uncertainty. However, discrete constraints remain a bottleneck for gradient-based neural solvers and are currently handled in three typical ways: 1) adding a soft penalty to the objective, which cannot guarantee a bounded constraint violation, a critical shortcoming in constraint-sensitive scenarios; 2) perturbing the input to obtain approximate gradients in a black-box manner, which obeys the constraints exactly but whose approximate gradients can hurt the objective score; 3) compromising with soft algorithms whose network outputs satisfy only a relaxed constraint, so an arbitrary degree of violation can still occur. Towards the ultimate goal of a general neural CO framework that can control an arbitrarily small degree of constraint violation, this paper focuses on a more tractable and common setting: cardinality constraints, which can be readily encoded by a differentiable optimal transport (OT) layer. Based on this observation, we propose an OT-based cardinality constraint encoding for end-to-end CO learning with two variants, Sinkhorn and Gumbel-Sinkhorn, whose constraint violation can be exactly characterized and bounded by our theoretical results. On synthetic and real-world CO instances, our methods surpass a state-of-the-art CO network and are comparable to (if not better than) the commercial solver Gurobi. As a further case study, we apply our approach to predictive portfolio optimization on real-world asset price data, improving the Sharpe ratio of a strong LSTM+Gurobi baseline from 1.1 to 2.0 under the classic predict-then-optimize paradigm.
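
To make the constraint-encoding idea concrete, below is a minimal sketch (not the paper's implementation) of a differentiable cardinality-k (top-k) selection layer built from entropic optimal transport with Sinkhorn iterations, together with a Gumbel-perturbed variant. The function names, the two-anchor cost design, and all hyperparameter values are illustrative assumptions.

```python
import torch

def sinkhorn_topk(scores, k, tau=0.05, n_iters=100):
    """Soft cardinality-k selection via entropic optimal transport.

    Transports n items (uniform mass 1/n each) to two bins, "unselected"
    (capacity (n - k)/n) and "selected" (capacity k/n), using Sinkhorn
    iterations in log space. The "selected" column of the transport plan,
    rescaled by n, is a differentiable relaxation of a k-hot indicator:
    entries lie in [0, 1] and sum to k by construction.
    """
    n = scores.shape[-1]
    # Cost of assigning each item to the two bins: squared distance of its
    # score to two anchors (the minimum and maximum score).
    anchors = torch.stack(
        [scores.min(dim=-1, keepdim=True).values,
         scores.max(dim=-1, keepdim=True).values], dim=-1)          # (..., 1, 2)
    cost = (scores.unsqueeze(-1) - anchors) ** 2                    # (..., n, 2)

    log_mu = torch.log(torch.full_like(scores, 1.0 / n))            # row marginals
    log_nu = torch.log(torch.tensor([(n - k) / n, k / n],
                                    dtype=scores.dtype,
                                    device=scores.device)).expand(*scores.shape[:-1], 2)

    log_K = -cost / tau
    f = torch.zeros_like(log_mu)
    g = torch.zeros_like(log_nu)
    for _ in range(n_iters):  # log-domain Sinkhorn for numerical stability
        f = log_mu - torch.logsumexp(log_K + g.unsqueeze(-2), dim=-1)
        g = log_nu - torch.logsumexp(log_K + f.unsqueeze(-1), dim=-2)
    plan = torch.exp(log_K + f.unsqueeze(-1) + g.unsqueeze(-2))     # (..., n, 2)
    return plan[..., 1] * n  # soft selection vector, summing to k

def gumbel_sinkhorn_topk(scores, k, tau=0.05, noise_scale=0.1, n_iters=100):
    """Gumbel-Sinkhorn variant: perturb scores with Gumbel noise before the
    same Sinkhorn layer, giving a stochastic relaxation useful for training."""
    u = torch.rand_like(scores).clamp_min(1e-10)
    gumbel = -torch.log(-torch.log(u))
    return sinkhorn_topk(scores + noise_scale * gumbel, k, tau, n_iters)
```

Feeding the soft selection vector into the CO objective (e.g., the portfolio return in the predict-then-optimize case study) keeps the whole pipeline differentiable; decreasing tau tightens the relaxation toward a hard top-k selection at the cost of slower Sinkhorn convergence.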
