Randomized function fitting-based empirical value iteration
Pengqian Yu | William B. Haskell | Rahul Jain | Hiteshi Sharma
[1] Csaba Szepesvári, et al. Finite-Time Bounds for Fitted Value Iteration, 2008, J. Mach. Learn. Res.
[2] William B. Haskell, et al. Empirical Dynamic Programming, 2013, Math. Oper. Res.
[3] A. Rahimi, et al. Uniform approximation of functions with random bases, 2008, 2008 46th Annual Allerton Conference on Communication, Control, and Computing.
[4] John Rust. Using Randomization to Break the Curse of Dimensionality, 1997.
[5] Benjamin Van Roy, et al. On Constraint Sampling in the Linear Programming Approach to Approximate Dynamic Programming, 2004, Math. Oper. Res.
[6] John N. Tsitsiklis, et al. The Complexity of Markov Decision Processes, 1987, Math. Oper. Res.
[7] Dirk Ormoneit, et al. Kernel-Based Reinforcement Learning, 2017, Encyclopedia of Machine Learning and Data Mining.
[8] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[9] Benjamin Recht, et al. Weighted Sums of Random Kitchen Sinks: Replacing minimization with randomization in learning, 2008, NIPS.
[10] Anthony Almudevar. Approximate Fixed Point Iteration with an Application to Infinite Horizon Markov Decision Processes, 2008, SIAM J. Control. Optim.
[11] Bharath Rangarajan, et al. Performance Guarantees for Empirical Markov Decision Processes with Applications to Multiperiod Inventory Models, 2012, Oper. Res.
[12] Guy Lever, et al. Modelling transition dynamics in MDPs with RKHS embeddings, 2012, ICML.
[13] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[14] Warren B. Powell, et al. Approximate Dynamic Programming: Solving the Curses of Dimensionality (Wiley Series in Probability and Statistics), 2007.
[15] Dimitri P. Bertsekas, et al. Dynamic programming and optimal control, 3rd Edition, 2005.
[16] David K. Smith, et al. Dynamic Programming and Optimal Control. Volume 1, 1996.