Value-Aware Loss Function for Model-based Reinforcement Learning
Daniel Nikovski | André Barreto | Amir Massoud Farahmand
[1] S. Geer. Empirical Processes in M-Estimation, 2000.
[2] Peter L. Bartlett, et al. Rademacher and Gaussian Complexities: Risk Bounds and Structural Results, 2003, J. Mach. Learn. Res.
[3] Adam Krzyzak, et al. A Distribution-Free Theory of Nonparametric Regression, 2002, Springer series in statistics.
[4] David J. C. MacKay, et al. Information Theory, Inference, and Learning Algorithms, 2004, IEEE Transactions on Information Theory.
[5] P. Bartlett, et al. Local Rademacher complexities, 2005, math/0508275.
[6] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[7] Miguel Á. Carreira-Perpiñán, et al. On Contrastive Divergence Learning, 2005, AISTATS.
[8] Liming Xiang, et al. Kernel-Based Reinforcement Learning, 2006, ICIC.
[9] Marek Petrik, et al. An Analysis of Laplacian Methods for Value Function Approximation in MDPs, 2007, IJCAI.
[10] Richard S. Sutton, et al. Reinforcement Learning of Local Shape in the Game of Go, 2007, IJCAI.
[11] Lihong Li, et al. Analyzing feature generation for value-function approximation, 2007, ICML '07.
[12] Rémi Munos, et al. Performance Bounds in Lp-norm for Approximate Value Iteration, 2007, SIAM J. Control. Optim.
[13] Sridhar Mahadevan, et al. Proto-value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes, 2007, J. Mach. Learn. Res.
[14] Alborz Geramifard, et al. Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping, 2008, UAI.
[15] Csaba Szepesvári, et al. Model-based and Model-free Reinforcement Learning for Visual Servoing, 2009, 2009 IEEE International Conference on Robotics and Automation.
[16] Bo Liu, et al. Basis Construction from Power Series Expansions of Value Functions, 2010, NIPS.
[17] Csaba Szepesvári, et al. Error Propagation for Approximate Policy and Value Iteration, 2010, NIPS.
[18] Csaba Szepesvári, et al. Algorithms for Reinforcement Learning, 2010, Synthesis Lectures on Artificial Intelligence and Machine Learning.
[19] André da Motta Salles Barreto, et al. Reinforcement Learning using Kernel-Based Stochastic Factorization, 2011, NIPS.
[20] Amir Massoud Farahmand, et al. Action-Gap Phenomenon in Reinforcement Learning, 2011, NIPS.
[21] Alborz Geramifard, et al. Online Discovery of Feature Dependencies, 2011, ICML.
[22] Peter Stone, et al. TEXPLORE: real-time sample-efficient reinforcement learning for robots, 2012, Machine Learning.
[23] Guy Lever, et al. Modelling transition dynamics in MDPs with RKHS embeddings, 2012, ICML.
[24] Doina Precup, et al. Value Pursuit Iteration, 2012, NIPS.
[25] Joelle Pineau, et al. Bellman Error Based Feature Generation using Random Projections on Sparse Spaces, 2013, NIPS.
[26] Jan Peters, et al. A Survey on Policy Search for Robotics, 2013, Found. Trends Robotics.
[27] Klaus Obermayer, et al. Construction of approximation spaces for reinforcement learning, 2013, J. Mach. Learn. Res.
[28] Yoshua Bengio, et al. Generative Adversarial Nets, 2014, NIPS.
[29] Xinhua Zhang, et al. Pseudo-MDPs and factored linear action models, 2014, 2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL).
[30] J. Andrew Bagnell, et al. Approximate MaxEnt Inverse Optimal Control and Its Application for Mental Simulation of Human Interactions, 2015, AAAI.
[31] R. Nickl, et al. Mathematical Foundations of Infinite-Dimensional Statistical Models, 2015.
[32] Carl E. Rasmussen, et al. Gaussian Processes for Data-Efficient Learning in Robotics and Control, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[33] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[34] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[35] Value-Aware Loss Function for Model Learning in Reinforcement Learning, 2016.
[36] Marlos C. Machado, et al. State of the Art Control of Atari Games Using Shallow Reinforcement Learning, 2015, AAMAS.
[37] Guigang Zhang, et al. Deep Learning, 2016, Int. J. Semantic Comput.
[38] John Shawe-Taylor, et al. Compressed Conditional Mean Embeddings for Model-Based Reinforcement Learning, 2016, AAAI.
[39] Bernardo Ávila Pires, et al. Policy Error Bounds for Model-Based Reinforcement Learning with Factored Linear Models, 2016, COLT.