The analysis and performance evaluation of the pheromone-Q-learning algorithm

Abstract: This paper presents the pheromone-Q-learning (Phe-Q) algorithm, a variation of Q-learning developed to allow agents to communicate and jointly learn to solve a problem. Phe-Q learning combines standard Q-learning with a synthetic pheromone that acts as a communication medium, speeding up the learning process of cooperating agents. The Phe-Q update equation includes a belief factor that reflects the confidence an agent has in the pheromone deposited in the environment by other agents. With this update equation, the speed of convergence towards an optimal solution depends on several parameters, including the number of agents solving the problem, the amount of pheromone deposited, the rate of diffusion into neighbouring cells and the evaporation rate. The main objective of this paper is to describe the Phe-Q algorithm and evaluate its performance. The paper demonstrates that cooperating Phe-Q agents outperform non-cooperating agents, and shows how Phe-Q learning can be improved by optimizing the parameters that control the use of the synthetic pheromone.
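The abstract names the ingredients of the method (a Q-learning update with a belief factor, plus pheromone deposit, diffusion and evaporation) but does not state the equations. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the belief factor is the pheromone in a neighbouring cell normalised by the total over all neighbours, and that it enters the update as a weighted term added to the Q-value inside the maximum. The function names, the weight `xi`, the four-neighbour diffusion rule and all default values are hypothetical choices made for illustration.

```python
def evaporate_and_diffuse(phi, evaporation=0.1, diffusion=0.2):
    """One pheromone step on a 2-D grid (assumed rule, for illustration).

    Each cell first loses a fraction `evaporation` of its pheromone, then
    passes a fraction `diffusion` of the remainder to its four neighbours
    in equal shares. Shares that would leave the grid are lost.
    """
    rows, cols = len(phi), len(phi[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            kept = (1.0 - evaporation) * phi[i][j]
            share = diffusion * kept / 4.0
            out[i][j] += kept * (1.0 - diffusion)
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < rows and 0 <= nj < cols:
                    out[ni][nj] += share
    return out


def belief(phi_neighbours):
    """Assumed belief factor: each neighbour's share of the local pheromone."""
    total = sum(phi_neighbours)
    if total == 0:
        return [0.0 for _ in phi_neighbours]
    return [p / total for p in phi_neighbours]


def phe_q_update(Q, B, s, a, r, s_next, alpha=0.1, gamma=0.9, xi=0.5):
    """Q-learning update with a belief term weighted by `xi` (assumed form).

    Q and B are lists indexed as [state][action]. With xi = 0 this reduces
    to the standard Q-learning update.
    """
    best = max(q + xi * b for q, b in zip(Q[s_next], B[s_next]))
    Q[s][a] += alpha * (r + gamma * best - Q[s][a])
    return Q[s][a]


if __name__ == "__main__":
    # Two states, two actions; pheromone favours action 1 in state 1.
    Q = [[0.0, 0.0], [0.0, 0.0]]
    B = [[0.0, 0.0], belief([1.0, 3.0])]
    print(phe_q_update(Q, B, s=0, a=1, r=1.0, s_next=1))
```

In this sketch, setting `xi` to zero switches off the pheromone's influence, which gives a simple way to compare cooperating and non-cooperating agents in the spirit of the evaluation the abstract describes.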

[1]  Gianni A. Di Caro,et al.  AntNet: A Mobile Agents Approach to Adaptive Routing , 1999 .

[2]  Luca Maria Gambardella,et al.  Ant colony system: a cooperative learning approach to the traveling salesman problem , 1997, IEEE Trans. Evol. Comput..

[3]  Luca Maria Gambardella,et al.  Ant-Q: A Reinforcement Learning Approach to the Traveling Salesman Problem , 1995, ICML.

[4]  Marco Dorigo,et al.  Two Ant Colony Algorithms for Best-Effort Routing in Datagram Networks , 1998 .

[5]  Dorothy Ndedi Monekosso,et al.  Phe-Q: A Pheromone Based Q-Learning , 2001, Australian Joint Conference on Artificial Intelligence.

[6]  Richard F. Hartl,et al.  Applying the ANT System to the Vehicle Routing Problem , 1999 .

[7]  H. Van Dyke Parunak,et al.  Ant-like missionaries and cannibals: synthetic pheromones for distributed motion control , 2000, AGENTS '00.

[8]  Thomas Stützle,et al.  ACO Algorithms for the Travelling Salesman Problem , 1999 .

[9]  Ming Tan,et al.  Multi-Agent Reinforcement Learning: Independent versus Cooperative Agents , 1997, ICML.

[10]  Gaurav S. Sukhatme,et al.  LOST: localization-space trails for robot teams , 2002, IEEE Trans. Robotics Autom..

[11]  Luca Maria Gambardella,et al.  A Study of Some Properties of Ant-Q , 1996, PPSN.

[12]  E. Wilson,et al.  Journey to the Ants , 1990 .

[13]  M Dorigo,et al.  Ant colonies for the quadratic assignment problem , 1999, J. Oper. Res. Soc..

[14]  Gaurav S. Sukhatme,et al.  Whistling in the dark: cooperative trail following in uncertain localization space , 2000, AGENTS '00.

[15]  Gian Luca Foresti,et al.  A distributed probabilistic system for adaptive regulation of image processing parameters , 1996, IEEE Trans. Syst. Man Cybern. Part B.

[16]  J. Deneubourg,et al.  Self-organized shortcuts in the Argentine ant , 1989, Naturwissenschaften.

[17]  J. Deneubourg,et al.  Collective decision making through food recruitment , 1990, Insectes Sociaux.

[18]  Dorothy Ndedi Monekosso,et al.  An Analysis of the Pheromone Q-Learning Algorithm , 2002, IBERAMIA.

[19]  Antonella Carbonaro,et al.  An ANTS Heuristic for the Long — Term Car Pooling Problem , 2004 .

[20]  J. Deneubourg,et al.  Collective patterns and decision-making , 1989 .

[21]  H. Van Dyke Parunak,et al.  Mechanisms and Military Applications for Synthetic Pheromones , 2001 .

[22]  A. Wren,et al.  An Ant System for Bus Driver Scheduling 1 , 1997 .

[23]  H. Van Dyke Parunak,et al.  Digital pheromone mechanisms for coordination of unmanned vehicles , 2002, AAMAS '02.

[24]  J. Deneubourg,et al.  Trails and U-turns in the Selection of a Path by the Ant Lasius niger , 1992 .

[25]  Marco Dorigo,et al.  Swarm intelligence: from natural to artificial systems , 1999 .

[26]  M. -C. Cammaerts-Tricot,et al.  Piste et phéromone attractive chez la fourmiMyrmica rubra , 1974, Journal of comparative physiology.