Paper information - Advances in Distributed Optimization Using Probability Collectives

Recent work has shown how information theory extends conventional full-rationality game theory to allow bounded rational agents. The associated mathematical framework can be used to solve distributed optimization and control problems. This is done by translating the distributed problem into an iterated game, where each agent's mixed strategy (i.e., its stochastically determined move) sets a different variable of the problem. The expected value of the objective function of the distributed problem is therefore determined by the joint probability distribution across the moves of the agents. The mixed strategies of the agents are updated from one game iteration to the next so as to converge on a joint distribution that optimizes this expected value. We present a set of new techniques for this update. These and older techniques are then extended to apply to uncountable move spaces. We also present an extension of the approach to include (in)equality constraints over the underlying variables. Finally, we show how to extend the Monte Carlo version of the approach to cases where some agents have no Monte Carlo samples for some of their moves, and we derive an "automatic annealing schedule".
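The iterated game described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the function name `pc_minimize`, its parameters, the damped Boltzmann update, the fixed geometric cooling (used here in place of the "automatic annealing schedule"), and the fallback for unsampled moves are all assumptions chosen for illustration, and the sketch covers only finite move spaces without constraints.

```python
import math
import random


def pc_minimize(objective, n_moves, n_iters=60, n_samples=200,
                temp=1.0, cooling=0.9, step=0.5, seed=0):
    """Hypothetical sketch: minimize objective(x), x[i] in range(n_moves[i])."""
    rng = random.Random(seed)
    # One independent mixed strategy per agent, initially uniform.
    q = [[1.0 / m] * m for m in n_moves]
    for _ in range(n_iters):
        # Monte Carlo: draw joint moves from the product distribution q.
        samples = [[rng.choices(range(m), weights=qi)[0]
                    for m, qi in zip(n_moves, q)]
                   for _ in range(n_samples)]
        values = [objective(x) for x in samples]
        mean_value = sum(values) / n_samples
        for i, m in enumerate(n_moves):
            # Estimate E[G | x_i = j] from samples in which agent i played j.
            sums, counts = [0.0] * m, [0] * m
            for x, g in zip(samples, values):
                sums[x[i]] += g
                counts[x[i]] += 1
            # Assumption: unsampled moves fall back to the overall sample mean.
            cond = [sums[j] / counts[j] if counts[j] else mean_value
                    for j in range(m)]
            # Boltzmann target over agent i's moves at the current temperature.
            low = min(cond)
            weights = [math.exp(-(c - low) / temp) for c in cond]
            total = sum(weights)
            # Damped step from the old strategy toward the target.
            q[i] = [(1 - step) * old + step * w / total
                    for old, w in zip(q[i], weights)]
        temp *= cooling  # fixed geometric cooling, chosen for illustration
    return q
```

Calling `pc_minimize(lambda x: (x[0] - 2) ** 2 + x[1] ** 2, [3, 3])` returns one probability vector per agent. Each agent sees only conditional sample averages of the shared objective, so the joint distribution stays a product of independent mixed strategies throughout.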
