Understanding Curriculum Learning in Policy Optimization for Solving Combinatorial Optimization Problems
S. Du | Yi Wu | Yuandong Tian | Runlong Zhou
[1] Yuxin Chen,et al. Fast Global Convergence of Natural Policy Gradient Methods with Entropy Regularization , 2020, Oper. Res..
[2] Shie Mannor,et al. Reinforcement Learning in Reward-Mixing MDPs , 2021, NeurIPS.
[3] Elias Boutros Khalil,et al. Deep Policies for Online Bipartite Matching: A Reinforcement Learning Approach , 2021, ArXiv.
[4] Elad Hazan,et al. A Boosting Approach to Reinforcement Learning , 2021, ArXiv.
[5] R. Srikant,et al. Linear Convergence of Entropy-Regularized Natural Policy Gradient with Linear Function Approximation , 2021, arXiv:2106.04096.
[6] Quoc V. Le,et al. A graph placement methodology for fast chip design , 2021, Nature.
[7] Zohar Feldman,et al. SOLO: Search Online, Learn Offline for Combinatorial Optimization Problems , 2021, SOCS.
[8] Brian T. Denton,et al. Multi-model Markov decision processes , 2021, IISE Trans..
[9] Andrea Lodi,et al. Combinatorial optimization and reasoning with graph neural networks , 2021, IJCAI.
[10] Shie Mannor,et al. RL for Latent MDPs: Regret Guarantees and a Lower Bound , 2021, NeurIPS.
[11] Susanne Albers,et al. Improved Online Algorithms for Knapsack and GAP in the Random Order Model , 2020, Algorithmica.
[12] Brendan O'Donoghue,et al. Sample Efficient Reinforcement Learning with REINFORCE , 2020, AAAI.
[13] Jalaj Bhandari,et al. On the Linear Convergence of Policy Gradient Methods for Finite MDPs , 2020, AISTATS.
[14] Evgeny Burnaev,et al. Reinforcement Learning for Combinatorial Optimization: A Survey , 2020, Comput. Oper. Res..
[15] Sham M. Kakade,et al. On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift , 2019, J. Mach. Learn. Res..
[16] Yoshua Bengio,et al. Machine Learning for Combinatorial Optimization: a Methodological Tour d'Horizon , 2018, Eur. J. Oper. Res..
[17] Tianyi Zhou,et al. CO-PILOT: COllaborative Planning and reInforcement Learning On sub-Task curriculum , 2021, NeurIPS.
[18] Javad Lavaei,et al. On the Global Convergence of Momentum-based Policy Gradient , 2021, ArXiv.
[19] Jeff A. Bilmes,et al. Robust Curriculum Learning: from clean label detection to noisy label self-correction , 2021, ICLR.
[20] Jeff A. Bilmes,et al. Curriculum Learning by Optimizing Learning Dynamics , 2021, AISTATS.
[21] Wotao Yin,et al. An Improved Analysis of (Variance-Reduced) Policy Gradient and Natural Policy Gradient Methods , 2022, NeurIPS.
[22] Mangpo Phitchaya Phothilimtha,et al. Transferable Graph Optimizers for ML Compilers , 2020, NeurIPS.
[23] Yoshua Bengio,et al. Mastering Rate based Curriculum Learning , 2020, ArXiv.
[24] David P. Williamson,et al. Learning to Solve Combinatorial Optimization Problems on Real-World Graphs in Linear Time , 2020, 2020 19th IEEE International Conference on Machine Learning and Applications (ICMLA).
[25] Daniel F. Perez-Ramirez,et al. Learning Combinatorial Optimization on Graphs: A Survey With Applications to Networking , 2020, IEEE Access.
[26] Tim Roughgarden,et al. Data-driven algorithm design , 2020, Commun. ACM.
[27] Csaba Szepesvari,et al. On the Global Convergence Rates of Softmax Policy Gradient Methods , 2020, ICML.
[28] Matthew E. Taylor,et al. Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey , 2020, J. Mach. Learn. Res..
[29] Zhaoran Wang,et al. Neural Policy Gradient Methods: Global Optimality and Rates of Convergence , 2019, ICLR.
[30] Jeff A. Bilmes,et al. Curriculum Learning by Dynamic Instance Hardness , 2020, NeurIPS.
[31] Yuhao Zhang,et al. Tight Competitive Ratios of Classic Matching Algorithms in the Fully Online Model , 2019, SODA.
[32] David Bergman,et al. Improving Optimization Bounds using Machine Learning: Decision Diagrams meet Deep Reinforcement Learning , 2018, AAAI.
[33] Max Welling,et al. Attention, Learn to Solve Routing Problems! , 2018, ICLR.
[34] David L. Dill,et al. Learning a SAT Solver from Single-Bit Supervision , 2018, ICLR.
[35] Zhiyi Huang,et al. Online Combinatorial Optimization Problems with Non-linear Objectives , 2019, Nonlinear Combinatorial Optimization.
[36] Javier Ruiz-del-Solar,et al. Interactive Learning with Corrective Feedback for Policies based on Deep Neural Networks , 2018, ISER.
[37] Lawrence V. Snyder,et al. Reinforcement Learning for Solving the Vehicle Routing Problem , 2018, NeurIPS.
[38] Weiwei Kong,et al. A new dog learns old tricks: RL finds classic optimization algorithms , 2018, ICLR.
[39] Alex Graves,et al. Automated Curriculum Learning for Neural Networks , 2017, ICML.
[40] Le Song,et al. 2 Common Formulation for Greedy Algorithms on Graphs , 2018 .
[41] Samy Bengio,et al. Neural Combinatorial Optimization with Reinforcement Learning , 2016, ICLR.
[42] Jason Weston,et al. Curriculum learning , 2009, ICML '09.
[43] Piotr Sankowski,et al. Stochastic analyses for online combinatorial optimization problems , 2008, SODA '08.
[44] Nicole Immorlica,et al. A Knapsack Secretary Problem with Applications , 2007, APPROX-RANDOM.
[45] Martin Grötschel,et al. Combinatorial Online Optimization in Real Time , 2001 .
[46] Sham M. Kakade,et al. A Natural Policy Gradient , 2001, NIPS.
[47] M. Beckmann,et al. Dynamic programming and the secretary problem , 1990 .