Sequential Transfer in Reinforcement Learning with a Generative Model
Marcello Restelli | Andrea Tirinzoni | Riccardo Poiani