We examine the applicability of fuzzy Q-learning to a multi-player non-cooperative repeated game. First, we formulate a transportation problem as a repeated game in which many agents (i.e., game players) compete with one another at several markets. Each agent chooses one market so as to maximize its own profit from selling its product at that market. The market price of the product is assumed to be determined by the demand-supply relation at each market. After formulating the repeated game, we explain how each agent can employ Q-learning to choose a market. We then extend Q-learning to fuzzy Q-learning so that each agent can use information about the previous market prices when choosing a market. The previous price of each market is represented by two fuzzy linguistic values, "low" and "high". Through computer simulations on a numerical example with 100 agents and five markets, we show that fuzzy Q-learning can learn effective market-selection strategies in the form of fuzzy If-Then rules.
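The market-selection scheme described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: a linear price function (`PRICE_MAX - slope * supply`), linear "low"/"high" membership functions over an assumed price range, one rule per low/high combination across the five markets, product-based rule compatibility, an epsilon-greedy policy, a myopic update with no discount term, and profit equal to the market price (transportation costs are omitted). All names (`FuzzyQAgent`, `rule_weights`, `market_prices`, `simulate`) are hypothetical.

```python
import itertools
import random

N_MARKETS = 5
PRICE_MIN, PRICE_MAX = 0.0, 100.0  # assumed price range for the membership functions


def memberships(price):
    """Return (mu_low, mu_high) for a price, using assumed linear memberships."""
    x = (price - PRICE_MIN) / (PRICE_MAX - PRICE_MIN)
    x = min(1.0, max(0.0, x))
    return 1.0 - x, x


# One rule per combination of low (0) / high (1) over the markets: 2**5 = 32 rules.
RULES = list(itertools.product((0, 1), repeat=N_MARKETS))


def rule_weights(prev_prices):
    """Compatibility of each rule with the previous prices (product of memberships)."""
    mus = [memberships(p) for p in prev_prices]
    weights = []
    for rule in RULES:
        w = 1.0
        for market, label in enumerate(rule):
            w *= mus[market][label]
        weights.append(w)
    return weights  # sums to 1 because mu_low + mu_high = 1 for every market


class FuzzyQAgent:
    """Agent holding one Q-value per (rule, market) pair."""

    def __init__(self, alpha=0.1, epsilon=0.1, rng=None):
        self.alpha = alpha      # learning rate
        self.epsilon = epsilon  # exploration probability
        self.rng = rng or random.Random()
        self.q = [[0.0] * N_MARKETS for _ in RULES]

    def q_values(self, weights):
        """Q-value of each market, interpolated over the rules by their weights."""
        return [sum(w * q_r[a] for w, q_r in zip(weights, self.q))
                for a in range(N_MARKETS)]

    def choose(self, weights):
        """Epsilon-greedy market choice."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(N_MARKETS)
        values = self.q_values(weights)
        return max(range(N_MARKETS), key=values.__getitem__)

    def update(self, weights, action, reward):
        """Move each rule's Q-value toward the reward, scaled by its weight."""
        for w, q_r in zip(weights, self.q):
            q_r[action] += self.alpha * w * (reward - q_r[action])


def market_prices(choices, slope=2.0):
    """Assumed linear demand-supply relation: more sellers means a lower price."""
    supply = [0] * N_MARKETS
    for c in choices:
        supply[c] += 1
    return [max(PRICE_MIN, PRICE_MAX - slope * s) for s in supply]


def simulate(n_agents=100, n_rounds=100, seed=0):
    """Play the repeated game and return the prices of the final round."""
    rng = random.Random(seed)
    agents = [FuzzyQAgent(rng=rng) for _ in range(n_agents)]
    prices = [PRICE_MAX / 2] * N_MARKETS  # arbitrary initial "previous" prices
    for _ in range(n_rounds):
        weights = rule_weights(prices)
        choices = [agent.choose(weights) for agent in agents]
        prices = market_prices(choices)
        for agent, choice in zip(agents, choices):
            agent.update(weights, choice, prices[choice])  # profit = price (assumption)
    return prices
```

Calling `simulate()` plays the game with 100 agents and five markets and returns the last round's prices. Each row of an agent's `q` table can be read as one fuzzy If-Then rule: the antecedent is the low/high pattern in `RULES`, and the consequent is the market with the largest Q-value in that row.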