Learning Task Allocation via Multi-level Policy Gradient Algorithm with Dynamic Learning Rate
[1] Edmund H. Durfee, et al. Optimal Resource Allocation and Policy Formulation in Loosely-Coupled Markov Decision Processes, 2004, ICAPS.
[2] Lex Weaver, et al. A Multi-Agent Policy-Gradient Approach to Network Routing, 2001, ICML.
[3] Abdel-Illah Mouaddib, et al. Task selection problem under uncertainty as decision-making, 2002, AAMAS '02.
[4] Michael L. Littman, et al. Packet Routing in Dynamically Changing Networks: A Reinforcement Learning Approach, 1993, NIPS.
[5] Manuela M. Veloso, et al. Multiagent learning using a variable learning rate, 2002, Artif. Intell.
[6] Ian T. Foster, et al. Grid information services for distributed resource sharing, 2001, Proceedings 10th IEEE International Symposium on High Performance Distributed Computing.
[7] Nils J. Nilsson, et al. Artificial Intelligence, 1974, IFIP Congress.
[8] Risto Miikkulainen, et al. Confidence-based Q-Routing: An on-line adaptive network routing algorithm, 1998.