State-Dependent Packet Scheduling for QoS Routing in Dynamic Networks

Packet scheduling in routers plays an important role in achieving QoS differentiation and in optimizing queuing delay, particularly when this optimization is carried out on every router along the path between source and destination. In a dynamically changing environment, a good scheduling discipline should also adapt to new traffic conditions. We model this problem as a multi-agent system in which each agent learns through continual interaction with the environment in order to optimize its own behaviour. We therefore adopt the framework of Markov decision processes applied to multi-agent systems and present a pheromone-Q learning approach, which combines multi-agent Q-learning with a synthetic pheromone that acts as a communication medium and speeds up the learning of cooperating agents.
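
To make the idea of combining Q-learning with a synthetic pheromone concrete, the following is a minimal sketch and not the formulation used in this work. It assumes a tabular Q-function, a pheromone table shared by the cooperating agents, and a belief term defined as the share of pheromone on each action in the next state. The names `pheromone_q_update` and `deposit_and_evaporate`, the weight `xi`, and the deposit and evaporation rates are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def deposit_and_evaporate(pheromone, state, action, deposit=1.0, evaporation=0.1):
    """Decay every pheromone entry, then reinforce the pair just used.

    Assumed rule: multiplicative evaporation followed by a fixed deposit.
    """
    for key in list(pheromone):
        pheromone[key] *= 1.0 - evaporation
    pheromone[(state, action)] += deposit


def pheromone_q_update(q, pheromone, state, action, reward, next_state,
                       actions, alpha=0.1, gamma=0.9, xi=0.5):
    """One Q-learning step whose bootstrap target is biased by pheromone.

    Assumed belief term: pheromone on (next_state, a) divided by the total
    pheromone over all actions in next_state (0 when nothing is deposited).
    """
    total = sum(pheromone[(next_state, a)] for a in actions)

    def belief(a):
        return pheromone[(next_state, a)] / total if total > 0 else 0.0

    # Standard max over next actions, with the pheromone bonus weighted by xi.
    best_next = max(q[(next_state, a)] + xi * belief(a) for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


# Usage: two hypothetical queues a scheduler agent can choose to serve.
actions = ["queue0", "queue1"]
q_table = defaultdict(float)
shared_pheromone = defaultdict(float)

# Another agent has already marked (s1, queue1) as useful.
deposit_and_evaporate(shared_pheromone, "s1", "queue1")
updated = pheromone_q_update(q_table, shared_pheromone, "s0", "queue0",
                             reward=1.0, next_state="s1", actions=actions)
```

With `xi = 0` the update reduces to ordinary Q-learning, so in this sketch the pheromone acts purely as shared information that biases each agent's estimate toward choices other agents have already reinforced.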