Learning GFlowNets from partial episodes for improved convergence and stability
Emmanuel Bengio, Y. Bengio, Tom Bosc, Kanika Madan, Maksym Korablyov, Moksh Jain, Jarrid Rector-Brooks, A. Nica, Nikolay Malkin
[1] Bonaventure F. P. Dossou, et al. Biological Sequence Design with GFlowNets, 2022, ICML.
[2] Chris C. Emezue, et al. Bayesian Structure Learning with Generative Flow Networks, 2022, UAI.
[3] S. Levine, et al. Design-Bench: Benchmarks for Data-Driven Offline Model-Based Optimization, 2022, ICML.
[4] Aaron C. Courville, et al. Generative Flow Networks for Discrete Probabilistic Modeling, 2022, ICML.
[5] Chen Sun, et al. Trajectory Balance: Improved Credit Assignment in GFlowNets, 2022, NeurIPS.
[6] Doina Precup, et al. Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation, 2021, NeurIPS.
[7] Weinan Zhang, et al. MARS: Markov Molecular Sampling for Multi-objective Drug Discovery, 2021, ICLR.
[8] Alex Rosenthal, et al. DBAASP v3: database of antimicrobial/cytotoxic activity and structure of peptides as a resource for development of new therapeutics, 2020, Nucleic Acids Research.
[9] Longbo Huang, et al. Exploration by Maximizing Rényi Entropy for Reward-Free RL Framework, 2020, AAAI.
[10] E. Xing, et al. Text Generation with Efficient (Soft) Q-Learning, 2021, ArXiv.
[11] Richard Wang, et al. AdaLead: A simple and robust adaptive greedy search algorithm for sequence design, 2020, ArXiv.
[12] Joelle Pineau, et al. Interference and Generalization in Temporal Difference Learning, 2020, ICML.
[13] Shimon Whiteson, et al. Deep Residual Reinforcement Learning, 2019, AAMAS.
[14] Larry Rudolph, et al. A Closer Look at Deep Policy Gradients, 2018, ICLR.
[15] Doina Precup, et al. Marginalized State Distribution Entropy Regularization in Policy Optimization, 2019, ArXiv.
[16] Petros Christodoulou, et al. Soft Actor-Critic for Discrete Action Settings, 2019, ArXiv.
[17] Jennifer Listgarten, et al. Conditioning by adaptive sampling for robust design, 2019, ICML.
[18] Sham M. Kakade, et al. Provably Efficient Maximum Entropy Exploration, 2018, ICML.
[19] Sergey Levine, et al. Diversity is All You Need: Learning Skills without a Reward Function, 2018, ICLR.
[20] Matteo Hessel, et al. Deep Reinforcement Learning and the Deadly Triad, 2018, ArXiv.
[21] Regina Barzilay, et al. Junction Tree Variational Autoencoder for Molecular Graph Generation, 2018, ICML.
[22] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[23] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[24] Lukasz Kaiser, et al. Attention Is All You Need, 2017, NIPS.
[25] Dale Schuurmans, et al. Bridging the Gap Between Value and Policy Based Reinforcement Learning, 2017, NIPS.
[26] Sergey Levine, et al. Reinforcement Learning with Deep Energy-Based Policies, 2017, ICML.
[27] Dmitry Chudakov, et al. Local fitness landscape of the green fluorescent protein, 2016, Nature.
[28] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[29] M. Lefranc, et al. DBAASP: database of antimicrobial activity and structure of peptides, 2014, FEMS Microbiology Letters.
[30] A. Voet, et al. Fragment based drug design: from experimental to computational approaches, 2012, Current Medicinal Chemistry.
[31] Sergey Lyskov, et al. PyRosetta: a script-based interface for implementing molecular modeling algorithms using Rosetta, 2010, Bioinformatics.
[32] Arthur J. Olson, et al. AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading, 2009, Journal of Computational Chemistry.
[33] J. Andrew Bagnell, et al. Modeling Purposeful Adaptive Behavior with the Principle of Maximum Causal Entropy, 2010.
[34] Richard S. Sutton, et al. Learning to predict by the methods of temporal differences, 1988, Machine Learning.
[35] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[36] David Baker, et al. Protein Structure Prediction Using Rosetta, 2004, Numerical Computer Methods, Part D.
[37] Michael Kearns, et al. Bias-Variance Error Bounds for Temporal Difference Updates, 2000, COLT.
[38] Leemon C. Baird, et al. Residual Algorithms: Reinforcement Learning with Function Approximation, 1995, ICML.
[39] Jing Peng, et al. Function Optimization using Connectionist Reinforcement Learning Algorithms, 1991.