Combining Rule Induction and Reinforcement Learning: An Agent-based Vehicle Routing

Reinforcement learning suffers from inefficiency when the space of potential solutions to be searched is large. This paper describes a method for improving reinforcement learning in multi-agent systems by applying rule induction. Knowledge captured by the learned rules is used to reduce the search space of reinforcement learning, shortening learning time. The method is particularly suitable for agents operating in dynamically changing environments, where a fast response to changes is required. It has been tested in the transportation logistics domain, with agents representing vehicles routed in a simple road network. Experimental results, supported by statistical comparison, indicate that in this domain the method outperforms traditional Q-learning.
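The core idea, using induced rules to prune the set of actions that Q-learning must explore in a given state, can be illustrated with a minimal sketch. This is not the authors' implementation: the rule representation (callables mapping a state to a set of permitted actions), the routing action names, and all hyperparameter values below are assumptions made for illustration only.

```python
import random
from collections import defaultdict

class RuleFilteredQLearner:
    """Q-learning whose action search is restricted by induced rules (sketch)."""

    def __init__(self, actions, rules=None, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)       # Q-values keyed by (state, action)
        self.actions = list(actions)
        self.rules = rules or []          # each rule: state -> set of allowed actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def candidate_actions(self, state):
        # If any induced rule fires for this state, search only the actions
        # it permits; otherwise fall back to the full action set.
        allowed = set()
        for rule in self.rules:
            allowed |= rule(state)
        return list(allowed) if allowed else self.actions

    def choose_action(self, state):
        candidates = self.candidate_actions(state)
        if random.random() < self.epsilon:
            return random.choice(candidates)          # explore, within the pruned set
        return max(candidates, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update, maximizing over the pruned set.
        best_next = max(self.q[(next_state, a)]
                        for a in self.candidate_actions(next_state))
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])

# Hypothetical routing rule: when the state reports congestion ahead,
# only detour actions remain candidates, so Q-learning never wastes
# trials on driving straight into the jam.
congestion_rule = lambda state: (
    {"detour_left", "detour_right"} if state.endswith("congested") else set()
)
agent = RuleFilteredQLearner(
    actions=["straight", "detour_left", "detour_right"],
    rules=[congestion_rule],
)
```

The point of the sketch is the division of labor: rule induction contributes symbolic knowledge that shrinks the candidate set, while Q-learning still chooses among the remaining actions, which is why learning time drops without giving up reinforcement-based adaptation.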
