A Negotiation-Based Genetic Framework for Multi-Agent Credit Assignment

Multi-agent systems are a well-established approach to implementing dynamic, complex environments. One open issue in these systems is the credit assignment problem: how to properly distribute feedback on overall performance so that each individual agent can learn from it. This paper proposes a genetic framework for solving the multi-agent credit assignment problem. The framework, Negotiation-Based Credit Assignment (NBCA), applies negotiation both to enrich agents' knowledge and to organize populations by means of a mode analyzer. The proposed architecture includes a mentor agent that is responsible for credit assignment and uses no context-related information, which leads to a general solution. Furthermore, the mentor agent receives no information regarding the correctness of any particular agent's behavior. The method is evaluated on Carry and non-Carry cases. In addition, the effect of noise, as a source of uncertainty, on NBCA performance is examined. Our findings indicate that the proposed method is superior to previous credit assignment approaches. This is attributed to the argumentation and negotiation features of multi-agent systems, which are used to accomplish team learning and credit assignment, respectively. Analysis of the results, which are also discussed theoretically, shows that NBCA outperforms KEBCA (OR-type) after 5000 trials in a noise-free (0% noise) environment. However, it performs worse than KEBCA in environments with 10% and 30% noise.
