Beren Millidge | Alexander Tschantz | Christopher L. Buckley | Anil K. Seth
[1] Chong Wang, et al. Stochastic variational inference, 2012, J. Mach. Learn. Res.
[2] Pieter Abbeel, et al. Equivalence Between Policy Gradients and Soft Q-Learning, 2017, ArXiv.
[3] Peter Dayan, et al. Q-learning, 1992, Machine Learning.
[4] Manuel Baltieri, et al. PID Control as a Process of Active Inference with Linear Generative Models, 2019, Entropy.
[5] Dirk P. Kroese, et al. The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation (Information Science and Statistics), 2004.
[6] Amos Storkey, et al. Advances in Neural Information Processing Systems 20, 2007.
[7] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[8] Marc Toussaint, et al. An Approximate Inference Approach to Temporal Optimization in Optimal Control, 2010, NIPS.
[9] Emanuel Todorov, et al. Iterative Linear Quadratic Regulator Design for Nonlinear Biological Movement Systems, 2004, ICINCO.
[10] Stefan Schaal, et al. Reinforcement learning of motor skills in high dimensions: A path integral approach, 2010, IEEE International Conference on Robotics and Automation.
[11] Hilbert J. Kappen, et al. Adaptive Importance Sampling for Control and Inference, 2015, ArXiv.
[12] William T. Freeman, et al. Correctness of Belief Propagation in Gaussian Graphical Models of Arbitrary Topology, 1999, Neural Computation.
[13] Marc Toussaint, et al. Probabilistic inference for solving discrete and continuous state Markov Decision Processes, 2006, ICML.
[14] Sean Gerrish, et al. Black Box Variational Inference, 2013, AISTATS.
[15] Sébastien Bubeck, et al. Convex Optimization: Algorithms and Complexity, 2014, Found. Trends Mach. Learn.
[16] Petros Koumoutsakos, et al. Reducing the Time Complexity of the Derandomized Evolution Strategy with Covariance Matrix Adaptation (CMA-ES), 2003, Evolutionary Computation.
[17] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[18] Ruben Villegas, et al. Learning Latent Dynamics for Planning from Pixels, 2018, ICML.
[19] Dale Schuurmans, et al. Bridging the Gap Between Value and Policy Based Reinforcement Learning, 2017, NIPS.
[20] Yisong Yue, et al. Iterative Amortized Inference, 2018, ICML.
[21] Simone Carlo Surace, et al. The Hitchhiker’s guide to nonlinear filtering, 2019, Journal of Mathematical Psychology.
[22] Karol Hausman, et al. Learning an Embedding Space for Transferable Robot Skills, 2018, ICLR.
[23] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[24] Reuven Y. Rubinstein, et al. Optimization of computer simulation models with rare events, 1997.
[25] Michael I. Jordan, et al. Graphical Models, Exponential Families, and Variational Inference, 2008, Found. Trends Mach. Learn.
[26] Geoffrey E. Hinton, et al. Using Expectation-Maximization for Reinforcement Learning, 1997, Neural Computation.
[27] D. Rubin, et al. Maximum likelihood from incomplete data via the EM algorithm (with discussion), 1977, J. R. Stat. Soc. B.
[28] Henry Zhu, et al. Soft Actor-Critic Algorithms and Applications, 2018, ArXiv.
[29] Allan Jabri, et al. Universal Planning Networks, 2018, ICML.
[30] Lih-Yuan Deng, et al. The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation, and Machine Learning, 2006, Technometrics.
[31] Matthew J. Beal. Variational algorithms for approximate Bayesian inference, 2003.
[32] Tadahiro Taniguchi, et al. Acceleration of Gradient-Based Path Integral Method for Efficient Optimal and Inverse Optimal Control, 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[33] Alexander M. Rush, et al. Semi-Amortized Variational Autoencoders, 2018, ICML.
[34] Daan Wierstra, et al. Stochastic Backpropagation and Approximate Inference in Deep Generative Models, 2014, ICML.
[35] Sergey Levine, et al. Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems, 2020, ArXiv.
[36] Tadahiro Taniguchi, et al. PlaNet of the Bayesians: Reconsidering and Improving Deep Planning Network by Incorporating Bayesian Inference, 2020, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[37] William T. Freeman, et al. Constructing free-energy approximations and generalized belief propagation algorithms, 2005, IEEE Transactions on Information Theory.
[38] J. Andrew Bagnell, et al. Modeling Purposeful Adaptive Behavior with the Principle of Maximum Causal Entropy, 2010.
[39] Yoshua Bengio, et al. Probabilistic Planning with Sequential Monte Carlo methods, 2018, ICLR.
[40] Ronald J. Williams. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 1992, Machine Learning.
[41] Matthew Fellows, et al. VIREL: A Variational Inference Framework for Reinforcement Learning, 2018, NeurIPS.
[42] Christopher M. Bishop, et al. Variational Message Passing, 2005, J. Mach. Learn. Res.
[43] Tadahiro Taniguchi, et al. Variational Inference MPC for Bayesian Model-based Reinforcement Learning, 2019, CoRL.
[44] Sergey Levine. Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review, 2018, ArXiv.
[45] Judea Pearl. Probabilistic reasoning in intelligent systems: networks of plausible inference, 1991, Morgan Kaufmann Series in Representation and Reasoning.
[46] Evangelos A. Theodorou, et al. Model Predictive Path Integral Control: From Theory to Parallel Computation, 2017.
[47] Nikolaus Hansen, et al. A restart CMA evolution strategy with increasing population size, 2005, IEEE Congress on Evolutionary Computation.
[48] J. Yedidia. Message-Passing Algorithms for Inference and Optimization, 2011.