Shimon Whiteson | Sebastian Schulze | Katja Hofmann | Yarin Gal | Kyriacos Shiarlis | Luisa Zintgraf | Maximilian Igl
[1] Li Zhang,et al. Learning to Learn: Meta-Critic Networks for Sample Efficient Learning , 2017, ArXiv.
[2] Karol Hausman,et al. Learning an Embedding Space for Transferable Robot Skills , 2018, ICLR.
[3] Zoran Popovic,et al. Trading Off Scientific Knowledge and User Learning with Multi-Armed Bandits , 2014, EDM.
[4] Peter Dayan,et al. Scalable and Efficient Bayes-Adaptive Reinforcement Learning Based on Monte-Carlo Tree Search , 2013, J. Artif. Intell. Res..
[5] Michael Kearns,et al. Near-Optimal Reinforcement Learning in Polynomial Time , 2002, Machine Learning.
[6] Lihong Li,et al. Policy Certificates: Towards Accountable Reinforcement Learning , 2018, ICML.
[7] P. Randolph. Bayesian Decision Problems and Markov Chains , 1968 .
[8] Danica Kragic,et al. VPE: Variational Policy Embedding for Transfer Reinforcement Learning , 2018, 2019 International Conference on Robotics and Automation (ICRA).
[9] Zhenguo Li,et al. Meta Reinforcement Learning with Task Embedding and Shared Policy , 2019, IJCAI.
[10] Pratik Shah,et al. Reinforcement Learning with Action-Derived Rewards for Chemotherapy and Clinical Trial Dosing Regimen Selection , 2018, MLHC.
[11] Yee Whye Teh,et al. Meta reinforcement learning as task inference , 2019, ArXiv.
[12] R. Bellman. A PROBLEM IN THE SEQUENTIAL DESIGN OF EXPERIMENTS , 1954 .
[13] Luca Antiga,et al. Automatic differentiation in PyTorch , 2017 .
[14] Sebastian Nowozin,et al. Meta-Learning Probabilistic Inference for Prediction , 2018, ICLR.
[15] Leslie Pack Kaelbling,et al. Planning and Acting in Partially Observable Stochastic Domains , 1998, Artif. Intell..
[16] Peter L. Bartlett,et al. RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning , 2016, ArXiv.
[17] Jesse Hoey,et al. An analytic solution to discrete Bayesian reinforcement learning , 2006, ICML.
[18] Andrew Y. Ng,et al. Near-Bayesian exploration in polynomial time , 2009, ICML '09.
[19] Finale Doshi-Velez,et al. Robust and Efficient Transfer Learning with Hidden Parameter Markov Decision Processes , 2017, AAAI.
[20] Malcolm J. A. Strens,et al. A Bayesian Framework for Reinforcement Learning , 2000, ICML.
[21] Sergey Levine,et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.
[22] Joelle Pineau,et al. Decoupling Dynamics and Reward for Transfer Learning , 2018, ICLR.
[23] J. Schulman,et al. Reptile: a Scalable Metalearning Algorithm , 2018 .
[24] Michael L. Littman,et al. Learning is planning: near Bayes-optimal reinforcement learning via Monte-Carlo tree search , 2011, UAI.
[25] Nando de Freitas,et al. Robust Imitation of Diverse Behaviors , 2017, NIPS.
[26] Leslie Pack Kaelbling,et al. Acting Optimally in Partially Observable Stochastic Domains , 1994, AAAI.
[27] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[28] Katja Hofmann,et al. Variational Inference for Data-Efficient Model Learning in POMDPs , 2018, ArXiv.
[29] Siddhartha S. Srinivasa,et al. Bayesian Policy Optimization for Model Uncertainty , 2018, ICLR.
[30] Ambuj Tewari,et al. Contextual Markov Decision Processes using Generalized Linear Models , 2019, ArXiv.
[31] Richard L. Lewis,et al. Variance-Based Rewards for Approximate Bayesian Reinforcement Learning , 2010, UAI.
[32] Sergey Levine,et al. Meta-Reinforcement Learning of Structured Exploration Strategies , 2018, NeurIPS.
[33] Felipe Petroski Such,et al. Efficient transfer learning and online adaptation with latent variable models for continuous control , 2018, ArXiv.
[34] Zeb Kurth-Nelson,et al. Learning to reinforcement learn , 2016, CogSci.
[35] Marcin Andrychowicz,et al. One-Shot Imitation Learning , 2017, NIPS.
[36] Mike Wu,et al. Meta-Amortized Variational Inference and Learning , 2019, AAAI.
[37] Shimon Whiteson,et al. Deep Variational Reinforcement Learning for POMDPs , 2018, ICML.
[38] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[39] Pieter Abbeel,et al. Evolved Policy Gradients , 2018, NeurIPS.
[40] Pieter Abbeel,et al. Meta-Learning with Temporal Convolutions , 2017, ArXiv.
[41] Ronen I. Brafman,et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning , 2001, J. Mach. Learn. Res..
[42] Michael Kearns,et al. Near-Optimal Reinforcement Learning in Polynomial Time , 2002 .
[43] W. R. Thompson. ON THE LIKELIHOOD THAT ONE UNKNOWN PROBABILITY EXCEEDS ANOTHER IN VIEW OF THE EVIDENCE OF TWO SAMPLES , 1933 .
[44] Katja Hofmann,et al. Fast Context Adaptation via Meta-Learning , 2018, ICML.
[45] Pieter Abbeel,et al. Some Considerations on Learning to Explore via Meta-Reinforcement Learning , 2018, ICLR.
[46] Albin Cassirer,et al. Randomized Prior Functions for Deep Reinforcement Learning , 2018, NeurIPS.
[47] Sepp Hochreiter,et al. Learning to Learn Using Gradient Descent , 2001, ICANN.
[48] Shie Mannor,et al. Contextual Markov Decision Processes , 2015, ArXiv.
[49] Yee Whye Teh,et al. Neural Processes , 2018, ArXiv.
[50] Finale Doshi-Velez,et al. Hidden Parameter Markov Decision Processes: A Semiparametric Regression Approach for Discovering Latent Task Parametrizations , 2013, IJCAI.
[51] Peter Dayan,et al. Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search , 2012, NIPS.
[52] Yee Whye Teh,et al. Meta-learning of Sequential Strategies , 2019, ArXiv.
[53] Benjamin Van Roy,et al. (More) Efficient Reinforcement Learning via Posterior Sampling , 2013, NIPS.
[54] Alessandro Lazaric,et al. Rewards and errors in multi-arm bandits for interactive education , 2016, NIPS.
[55] Emma Brunskill,et al. Bayes-optimal reinforcement learning for discrete uncertainty domains , 2012, AAMAS.
[56] Tom Schaul,et al. Reinforcement Learning with Unsupervised Auxiliary Tasks , 2016, ICLR.
[57] Andrew G. Barto,et al. Optimal learning: computational procedures for bayes-adaptive markov decision processes , 2002 .
[58] Alexander J. Smola,et al. Deep Sets , 2017, 1703.06114.
[59] Shie Mannor,et al. Bayesian Reinforcement Learning: A Survey , 2015, Found. Trends Mach. Learn..
[60] Sergey Levine,et al. Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings , 2018, ICML.
[61] Leslie Pack Kaelbling,et al. Bayesian Policy Search with Policy Priors , 2011, IJCAI.
[62] Sergey Levine,et al. Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables , 2019, ICML.
[63] Tamim Asfour,et al. ProMP: Proximal Meta-Policy Search , 2018, ICLR.
[64] Nan Jiang,et al. Contextual Decision Processes with low Bellman rank are PAC-Learnable , 2016, ICML.
[65] Katja Hofmann,et al. Meta Reinforcement Learning with Latent Variable Gaussian Processes , 2018, UAI.