Simulation-based Algorithms for Markov Decision Processes / Hyeong Soo Chang ... [et al.]
Steven I. Marcus | Jiaqiao Hu | Hyeong Soo Chang | Michael C. Fu
[1] M. Fu,et al. Optimization of discrete event systems via simultaneous perturbation stochastic approximation , 1997 .
[2] M. Fu. Convergence of a stochastic approximation algorithm for the GI/G/1 queue using infinitesimal perturbation analysis , 1990 .
[3] Reha Uzsoy,et al. A review of production planning and scheduling models in the semiconductor industry , 1994 .
[4] James C. Spall,et al. Adaptive stochastic approximation by the simultaneous perturbation method , 2000, IEEE Trans. Autom. Control.
[5] V. Borkar. Stochastic approximation with two time scales , 1997 .
[6] James C. Spall,et al. A one-measurement form of simultaneous perturbation stochastic approximation , 1997, Autom.
[7] James C. Spall,et al. An Overview of the Simultaneous Perturbation Method for Efficient Optimization , 1998 .
[8] David D. Yao,et al. A queueing network model for semiconductor manufacturing , 1996 .
[9] John N. Tsitsiklis,et al. Call admission control and routing in integrated services networks using neuro-dynamic programming , 2000, IEEE Journal on Selected Areas in Communications.
[10] S. Mitter,et al. Recursive stochastic algorithms for global optimization in R^d , 1991 .
[11] J. Spall,et al. Simulation-Based Optimization with Stochastic Approximation Using Common Random Numbers , 1999 .
[12] Michael C. Fu,et al. Optimal structured feedback policies for ABR flow control using two-timescale SPSA , 2001, TNET.
[13] S. Marcus,et al. A Simulation-Based Policy Iteration Algorithm for Average Cost Unichain Markov Decision Processes , 2000 .
[14] L. Gerencser,et al. SPSA for non-smooth optimization with application in ECG analysis , 1998, Proceedings of the 37th IEEE Conference on Decision and Control (Cat. No.98CH36171).
[15] Dimitri P. Bertsekas,et al. Missile defense and interceptor allocation by neuro-dynamic programming , 2000, IEEE Trans. Syst. Man Cybern. Part A.
[16] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[17] A. Ruszczynski,et al. Stochastic approximation method with gradient averaging for unconstrained problems , 1983 .
[18] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[19] Peter Dayan,et al. Technical Note: Q-Learning , 2004, Machine Learning.
[20] Payman Sadegh,et al. Constrained optimization via stochastic approximation with a simultaneous perturbation gradient approximation , 1997, Autom.
[21] Xi-Ren Cao,et al. Perturbation analysis of discrete event dynamic systems , 1991 .
[22] Andrew G. Barto,et al. Reinforcement learning , 1998 .
[23] Shanling Li,et al. Dynamic Capacity Expansion Problem with Multiple Products: Technology Selection and Timing of Capacity Additions , 1994, Oper. Res.
[24] Stuart Bermon,et al. Capacity analysis of complex manufacturing facilities , 1995, Proceedings of 1995 34th IEEE Conference on Decision and Control.
[25] Richard S. Sutton,et al. Reinforcement Learning , 1992, Handbook of Machine Learning.
[26] Paul Glasserman,et al. Gradient Estimation Via Perturbation Analysis , 1990 .
[27] Stochastic approximation for global random optimization , 2000, Proceedings of the 2000 American Control Conference. ACC (IEEE Cat. No.00CH36334).
[28] John N. Tsitsiklis,et al. Actor-Critic Algorithms , 1999, NIPS.
[29] J. Spall. Multivariate stochastic approximation using a simultaneous perturbation gradient approximation , 1992 .
[30] X. Cao,et al. Single Sample Path-Based Optimization of Markov Chains , 1999 .
[31] Vivek S. Borkar,et al. Learning Algorithms for Markov Decision Processes with Average Cost , 2001, SIAM J. Control. Optim.
[32] Vivek S. Borkar,et al. Multiscale Stochastic Approximation for Parametric Optimization of Hidden Markov Models , 1997, Probability in the Engineering and Informational Sciences.
[33] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[34] P. Glynn,et al. Stochastic optimization by simulation: numerical experiments with the M/M/1 queue in steady-state , 1994 .
[35] Dirk Beyer,et al. Stochastic Multiproduct Inventory Models with Limited Storage , 2001 .
[36] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[37] J. Spall. Implementation of the simultaneous perturbation algorithm for stochastic optimization , 1998 .
[38] S. Bhatnagar,et al. A two timescale stochastic approximation scheme for simulation-based parametric optimization , 1998 .
[39] Michael C. Fu,et al. Sample Path Derivatives for (s, S) Inventory Systems , 1994, Oper. Res.
[40] E. Chong,et al. A deterministic analysis of stochastic approximation with randomized directions , 1998, IEEE Transactions on Automatic Control.
[41] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[42] E. Chong,et al. Optimization of queues using an infinitesimal perturbation analysis-based stochastic algorithm with general update times , 1993 .
[43] Xi-Ren Cao,et al. Semi-Markov decision problems and performance sensitivity analysis , 2003, IEEE Trans. Autom. Control.
[44] Han-Fu Chen,et al. A Stochastic Approximation Algorithm with Random Differences , 1996 .
[45] Gang George Yin. Rates of Convergence for a Class of Global Stochastic Optimization Algorithms , 1999, SIAM J. Optim.
[46] John N. Tsitsiklis,et al. Asynchronous Stochastic Approximation and Q-Learning , 1994, Machine Learning.
[47] Vivek S. Borkar,et al. The actor-critic algorithm as multi-time-scale stochastic approximation , 1997 .
[48] J. Spall,et al. Model-free control of nonlinear stochastic systems with discrete-time measurements , 1998, IEEE Trans. Autom. Control.
[49] Harold J. Kushner,et al. Stochastic Approximation Algorithms and Applications , 1997, Applications of Mathematics.
[50] P. Varaiya. Convergence of a Stochastic Approximation Algorithm for the GI/G/1 Queue Using Infinitesimal Perturbation Analysis , 1990 .
[51] H. Kushner. Asymptotic global behavior for stochastic approximation and diffusions with slowly decreasing noise effects: Global minimization via Monte Carlo , 1987 .
[52] Benjamin Van Roy,et al. A neuro-dynamic programming approach to retailer inventory management , 1997, Proceedings of the 36th IEEE Conference on Decision and Control.
[53] J. Dippon,et al. Weighted Means in Stochastic Approximation of Minima , 1997 .
[54] Reha Uzsoy,et al. A Review of Production Planning and Scheduling Models in the Semiconductor Industry Part I: System Characteristics, Performance Evaluation and Production Planning , 1992 .
[55] László Gerencsér,et al. Convergence rate of moments in stochastic approximation with simultaneous perturbation gradient approximation and resetting , 1999, IEEE Trans. Autom. Control.
[56] Patrick M. Fitzpatrick. Advanced Calculus: A Course in Mathematical Analysis , 1995 .
[57] Mark S. Fox,et al. Intelligent Scheduling , 1998 .
[58] D. Varberg. Convex Functions , 1973 .
[59] E. Fernandez-Gaucherand,et al. S^2YSCODE: stochastic systems control and decision algorithms software laboratory, FORTRAN and MATLAB versions , 1994, Proceedings of IEEE Symposium on Computer-Aided Control Systems Design (CACSD).
[60] Jian-Qiang Hu,et al. Conditional Monte Carlo: Gradient Estimation and Optimization Applications , 2012 .
[61] S. Bhatnagar,et al. Two-timescale algorithms for simulation optimization of hidden Markov models , 2001 .
[62] David D. Yao,et al. Capacity allocation in semiconductor fabrication , 1999, Proceedings of the 38th IEEE Conference on Decision and Control (Cat. No.99CH36304).
[63] Vivek S. Borkar,et al. Actor-Critic - Type Learning Algorithms for Markov Decision Processes , 1999, SIAM J. Control. Optim.
[64] D. Tirupati,et al. Technology choice with stochastic demands and dynamic capacity allocation: A two‐product analysis , 1995 .
[65] Jayashankar M. Swaminathan. Tool capacity planning for semiconductor fabrication facilities under demand uncertainty , 2000, Eur. J. Oper. Res.
[66] John N. Tsitsiklis,et al. Average cost temporal-difference learning , 1997, Proceedings of the 36th IEEE Conference on Decision and Control.
[67] Steven I. Marcus,et al. Simulation-Based Algorithms for Average Cost Markov Decision Processes , 1999 .
[68] Harold J. Kushner,et al. Stochastic approximation methods for constrained and unconstrained systems , 1978 .
[69] Peter Marbach,et al. Simulation-based optimization of Markov decision processes , 1998 .
[70] P. Glynn,et al. Stochastic Optimization by Simulation: Convergence Proofs for the GI/G/1 Queue in Steady-State , 1994 .
[71] E. Chong,et al. Stochastic optimization of regenerative systems using infinitesimal perturbation analysis , 1994, IEEE Trans. Autom. Control.
[72] Shanling Li,et al. Impact of product mix flexibility and allocation policies on technology , 1997, Comput. Oper. Res.