A Solution to Time-Varying Markov Decision Processes