Multitime scale Markov decision processes
Hyeong Soo Chang | Pedram Jaefari Fard | Steven I. Marcus | Mark A. Shayman
[1] Magdi S. Mahmoud, et al. Multilevel Systems Control and Applications: A Survey, 1977, IEEE Transactions on Systems, Man, and Cybernetics.
[2] P. Varaiya, et al. Multilayer control of large Markov chains, 1978.
[3] T. Başar, et al. Dynamic Noncooperative Game Theory, 1982.
[4] Jean Walrand, et al. Extensions of the multiarmed bandit problem: The discounted case, 1985.
[5] Gabriel R. Bitran, et al. Production Planning of Style Goods with High Setup Costs and Forecast Revisions, 1986, Oper. Res.
[6] Stanley B. Gershwin, et al. Hierarchical flow control: a framework for scheduling and planning discrete events in manufacturing systems, 1989, Proc. IEEE.
[7] O. Hernández-Lerma. Adaptive Markov Control Processes, 1989.
[8] O. Hernández-Lerma, et al. Error bounds for rolling horizon policies in discrete-time Markov control processes, 1990.
[9] Kishor S. Trivedi, et al. A methodology for formal expression of hierarchy in model solution, 1993, Proceedings of the 5th International Workshop on Petri Nets and Performance Models.
[10] Wolfgang Fischer, et al. The Markov-Modulated Poisson Process (MMPP) Cookbook, 1993, Perform. Evaluation.
[11] M. K. Ghosh, et al. Discrete-time controlled Markov processes with average cost criterion: a survey, 1993.
[12] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[13] Qing Zhang, et al. Hierarchical Decision Making in Stochastic Manufacturing Systems, 1994.
[14] T. Başar, et al. Multi-time scale zero-sum differential games with perfect state measurements, 1995.
[15] Walter Willinger, et al. Self-Similarity in High-Speed Packet Traffic: Analysis and Modeling of Ethernet Traffic Measurements, 1995.
[16] John N. Tsitsiklis, et al. Statistical Multiplexing of Multiple Time-Scale Markov Streams, 1995, IEEE J. Sel. Areas Commun.
[17] Leslie Pack Kaelbling, et al. On the Complexity of Solving Markov Decision Problems, 1995, UAI.
[18] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[19] Ralph Neuneier, et al. Optimal Asset Allocation using Adaptive Dynamic Programming, 1995, NIPS.
[20] Kishor S. Trivedi, et al. Markov Dependability Models of Complex Systems: Analysis Techniques, 1996.
[21] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996.
[22] Alfred Müller, et al. How Does the Value Function of a Markov Decision Process Depend on the Transition Probabilities?, 1997, Math. Oper. Res.
[23] Sean P. Meyn. The policy iteration algorithm for average reward Markov decision processes with general state space, 1997, IEEE Trans. Autom. Control.
[24] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[25] Dimitri P. Bertsekas, et al. Rollout Algorithms for Stochastic Scheduling Problems, 1999, J. Heuristics.
[26] Yong-Pin Zhou, et al. A Single-Server Queue with Markov Modulated Service Times, 1999.
[27] Kihong Park, et al. Multiple Time Scale Congestion Control for Self-Similar Network Traffic, 1999, Perform. Evaluation.
[28] Nicola Secomandi, et al. Comparing neuro-dynamic programming algorithms for the vehicle routing problem with stochastic demands, 2000, Comput. Oper. Res.
[29] Robert Givan, et al. A framework for simulation-based network control via hindsight optimization, 2000, Proceedings of the 39th IEEE Conference on Decision and Control.
[30] Robert Givan, et al. On-line Scheduling via Sampling, 2000, AIPS.
[31] Kishor S. Trivedi, et al. Stochastic Modeling Formalisms for Dependability, Performance and Performability, 2000, Perform. Evaluation.
[32] Bruce H. Krogh, et al. Mode-matching control policies for multi-mode Markov decision processes, 2001, Proceedings of the 2001 American Control Conference.
[33] Gang Wu, et al. Congestion control via online sampling, 2001, Proceedings IEEE INFOCOM 2001.
[34] P. Glynn, et al. Hoeffding's inequality for uniformly ergodic Markov chains, 2002.
[35] Noah Gans, et al. Managing Learning and Turnover in Employee Staffing, 1999, Oper. Res.
[36] David Tse, et al. A time-scale decomposition approach to measurement-based admission control, 2003, TNET.
[37] Vishal Sharma, et al. Framework for Multi-Protocol Label Switching (MPLS)-based Recovery, 2003, RFC.
[38] S. Marcus, et al. Approximate receding horizon approach for Markov decision processes: average reward case, 2003.
[39] Robert Givan, et al. Parallel Rollout for Online Solution of Partially Observable Markov Decision Processes, 2004, Discret. Event Dyn. Syst.
[40] Satinder Singh, et al. An upper bound on the loss from approximate optimal-value functions, 1994, Machine Learning.
[41] Yishay Mansour, et al. A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes, 1999, Machine Learning.
[42] Michael C. Fu, et al. An Adaptive Sampling Algorithm for Solving Markov Decision Processes, 2005, Oper. Res.