A Proposal of Q-Learning for Applying Experimental Knowledge about an Unknown Environment to a Heterogeneous Multi-Agent System

The purpose of our research is the acquisition of cooperative behaviors in a heterogeneous multi-agent system. In this paper, we use the fire panic problem, in which agents must learn how to escape from a spreading fire. We also observe the agents' behavior and examine the characteristics of each agent type. In the fire panic problem, a fire breaks out in a field and spreads step by step, growing through three power levels over the course of a few steps. Each agent must keep away from the fire and reach a goal position, an exit, without being touched by it. Our aim in this paper is therefore the acquisition of behavior that reaches the goal in a changing environment. We propose a new Q-learning method based on CMAC that is equipped with special functions for adding elements. The target field is composed of several kinds of elements: the agents, the exit points, and the walls. In a conventional Q-learning design for this type of problem, an agent selects a random action the first time an element appears in its view area. The proposed learning system instead uses a multi-layer rule base, in which each layer covers a different subset of the elements. If one rule base contains no valid rule, another layer can still output an incomplete action from the elements that are partially valid in its rule base, so the resulting behavior is not a random walk. If the agent receives a penalty because such an action is wrong, the Q-learning system modifies the current layer and adds the new elements. The proposed Q-learning system thus continues learning, adding new elements and updating the next layer whenever the agents observe new elements. In this paper, we discuss the proposed idea and show results on the fire panic problem.
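To make the layered rule-base idea concrete, the following is a minimal sketch in Python written under our own assumptions: the class name LayeredQAgent, the element keys ("exit", "fire", "wall"), the observation format (a dictionary from observed element type to a relative position), and the policy of spawning a more detailed layer after a penalty are all illustrative choices, not the paper's implementation, which is based on CMAC tilings.

    import random
    from collections import defaultdict

    ACTIONS = ["up", "down", "left", "right"]

    class LayeredQAgent:
        """Illustrative layered Q-learning agent (hypothetical sketch)."""

        def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1):
            self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
            # Each layer stores a Q-table over a subset of element types.
            # Layer 0 starts with the exit only; further elements ("fire",
            # "wall", other agents) are added once they cause penalties.
            self.layers = [{"elements": {"exit"},
                            "q": defaultdict(lambda: defaultdict(float))}]

        def _key(self, observation, elements):
            # Project the full observation (e.g. {"exit": (2, 0), "fire": (1, 1)})
            # onto the element subset handled by one layer.
            return tuple(sorted((e, observation[e]) for e in elements if e in observation))

        def select_action(self, observation):
            if random.random() < self.epsilon:
                return random.choice(ACTIONS), len(self.layers) - 1
            # Try the most detailed layer first and fall back to coarser layers
            # whose elements are at least partially observed, so that an unseen
            # combination yields an "incomplete" action rather than a random walk.
            for idx in reversed(range(len(self.layers))):
                layer = self.layers[idx]
                key = self._key(observation, layer["elements"])
                if key in layer["q"]:
                    q = layer["q"][key]
                    return max(ACTIONS, key=lambda a: q[a]), idx
            return random.choice(ACTIONS), 0

        def update(self, obs, action, reward, next_obs, layer_idx):
            layer = self.layers[layer_idx]
            key = self._key(obs, layer["elements"])
            next_key = self._key(next_obs, layer["elements"])
            best_next = max(layer["q"][next_key][a] for a in ACTIONS)
            q = layer["q"][key]
            q[action] += self.alpha * (reward + self.gamma * best_next - q[action])
            # On a penalty, grow the rule base: add the newly observed elements
            # as a more detailed layer so future decisions can distinguish them.
            if reward < 0:
                new_elems = layer["elements"] | set(obs)
                if all(new_elems != l["elements"] for l in self.layers):
                    self.layers.append({"elements": new_elems,
                                        "q": defaultdict(lambda: defaultdict(float))})

A simulator step would call select_action on the agent's current view, apply the chosen action in the field, and then pass the resulting reward and next view to update together with the index of the layer that produced the action.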