A Distributed Decision-Making Structure for Dynamic Resource Allocation Using Nonlinear Functional Approximations
[1] Richard S. Sutton,et al. Temporal credit assignment in reinforcement learning , 1984 .
[2] Edmund H. Durfee,et al. Cooperation through communication in a distributed problem-solving network , 1990 .
[3] Edmund H. Durfee,et al. Distributed artificial intelligence , 1998 .
[4] Warren B. Powell,et al. An Adaptive Dynamic Programming Algorithm for Dynamic Fleet Management, II , 2002 .
[5] Warren B. Powell,et al. An Adaptive Dynamic Programming Algorithm for Dynamic Fleet Management, I: Single Period Travel Times , 2002, Transp. Sci..
[6] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[7] P. Schweitzer,et al. Generalized polynomial approximations in Markovian decision processes , 1985 .
[8] Thomas G. Dietterich. What is machine learning? , 2020, Archives of Disease in Childhood.
[9] Brahim Chaib-draa,et al. An overview of distributed artificial intelligence , 1996 .
[10] Sean R. Eddy,et al. What is dynamic programming? , 2004, Nature Biotechnology.
[11] Richard Withey. The convergence of convergence , 2001, Aslib Proc..
[12] Tad Hogg,et al. Spawn: A Distributed Computational Economy , 1992, IEEE Trans. Software Eng..
[13] Nicholas R. Jennings,et al. Foundations of distributed artificial intelligence , 1996, Sixth-generation computer technology series.
[14] Warren B. Powell,et al. Dynamic-Programming Approximations for Stochastic Time-Staged Integer Multicommodity-Flow Problems , 2006, INFORMS J. Comput..
[15] Rahul Simha,et al. A Microeconomic Approach to Optimal Resource Allocation in Distributed Computer Systems , 1989, IEEE Trans. Computers.
[16] John N. Tsitsiklis,et al. Analysis of temporal-difference learning with function approximation , 1996, NIPS 1996.
[17] Daniel D. Corkill,et al. A framework for organizational self-design in distributed problem solving networks , 1983 .
[18] Manuela M. Veloso,et al. Multiagent Systems: A Survey from a Machine Learning Perspective , 2000, Auton. Robots.
[19] Warren B. Powell,et al. An Adaptive, Distribution-Free Algorithm for the Newsvendor Problem with Censored Demands, with Applications to Inventory and Distribution , 2001 .
[20] Peter Dayan,et al. The convergence of TD(λ) for general λ , 1992, Machine Learning.
[21] R. Wets,et al. L-Shaped Linear Programs with Applications to Optimal Control and Stochastic Programming , 1969 .
[22] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[23] Alan H. Bond,et al. Readings in Distributed Artificial Intelligence , 1988 .
[24] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[25] Alan H. Bond,et al. Distributed Artificial Intelligence , 1988 .
[26] Pat Langley,et al. Elements of Machine Learning , 1995 .
[27] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[28] Ben J. A. Kröse,et al. Learning from delayed rewards , 1995, Robotics Auton. Syst..
[29] Benjamin Van Roy,et al. The Linear Programming Approach to Approximate Dynamic Programming , 2003, Oper. Res..