Adaptive Load Balancing: A Study in Multi-Agent Learning

We study multi-agent reinforcement learning in the context of load balancing in a distributed system, without the use of either central coordination or explicit communication. We first define a precise framework in which to study adaptive load balancing, important features of which are its stochastic nature and the purely local information available to individual agents. Given this framework, we present results on the interplay between basic adaptive behavior parameters and their effect on system efficiency. We then investigate the properties of adaptive load balancing in heterogeneous populations and address the issue of exploration vs. exploitation in that context. Finally, we show that naive use of communication may not improve, and might even harm, system efficiency.
