One proposed approach to managing a large, complex Smart Grid is through Broker Agents that buy electrical power from distributed producers and sell it to consumers via a Tariff Market, a new market mechanism in which Broker Agents publish concurrent bid and ask prices. A key challenge is specifying the market strategy that Broker Agents should use to earn profits while maintaining the market's balance of supply and demand. Previous work has shown that a Broker Agent can learn its strategy using Markov Decision Processes (MDPs) and Q-learning, and can outperform other Broker Agents that use predetermined or randomized strategies. In this work, we investigate the more representative scenario in which multiple Broker Agents, rather than a single one, independently learn their strategies. Using a simulation environment based on real data, we find that Broker Agents that periodically increase their exploration achieve higher rewards. We also find that varying levels of market dominance in customer allocation models produce markedly different outcomes in market prices and aggregate Broker Agent rewards. The latter set of results can be explained by established economic principles regarding the emergence of monopolies in market-based competition, further validating our approach.
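To make the idea of periodic increases in exploration concrete, the following is a minimal sketch of tabular Q-learning with an epsilon-greedy policy whose exploration rate is periodically reset to a higher value. The abstract does not specify the schedule, so the function names and all parameter values here (`base`, `peak`, `period`, `decay`, `alpha`, `gamma`) are illustrative assumptions, not the authors' settings.

```python
import random


def epsilon_at(step, base=0.05, peak=0.5, period=100, decay=0.9):
    """Hypothetical schedule: exploration jumps to `peak` at the start of
    every `period` steps, then decays geometrically toward `base`."""
    phase = step % period
    return base + (peak - base) * (decay ** phase)


def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy choice over a list of action values for one state."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))  # explore: uniform random action
    return max(range(len(q_row)), key=q_row.__getitem__)  # exploit: greedy


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard one-step tabular Q-learning update, applied in place."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
```

In a multi-agent setting, each Broker Agent would keep its own table `q` and call `epsilon_at` with its own step counter. The renewed exploration gives an agent a chance to adapt when the other learners change their strategies.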