VidyutVanika: A Reinforcement Learning Based Broker Agent for a Power Trading Competition

A smart grid is an efficient and sustainable energy system that integrates diverse generation entities, distributed storage capacity, and smart appliances and buildings. A smart grid brings new kinds of participants into the energy market it serves, whose effect on the grid can only be determined through high-fidelity simulations. Power TAC offers one such simulation platform, built on real-world weather data and complex, state-of-the-art customer models. In Power TAC, autonomous energy brokers compete to make profits across tariff, wholesale, and balancing markets while maintaining the stability of the grid. In this paper, we describe the design of VidyutVanika, an autonomous broker and the runner-up in the 2018 Power TAC competition. VidyutVanika relies on reinforcement learning (RL) in the tariff market and dynamic programming in the wholesale market, solving modified versions of known Markov Decision Process (MDP) formulations in the respective markets. The novelty lies in how we define the reward functions of these MDPs, solve them, and map their solutions to concrete market actions. Unlike previous participating agents, VidyutVanika uses a neural network to predict the energy consumption of various customers from weather data. We use several heuristics to bridge the gap between the restricted action spaces of the MDPs and the much more extensive action space available to VidyutVanika. These heuristics allow VidyutVanika to convert near-optimal fixed tariffs into time-of-use tariffs aimed at mitigating transmission capacity fees, spread its orders across several auctions in the wholesale market to procure energy at a lower price, estimate the parameters required to implement the wholesale MDP solution more accurately, and account for wholesale procurement costs while optimizing tariffs. We use Power TAC 2018 tournament data and controlled experiments to analyze the performance of VidyutVanika and to illustrate the efficacy of these strategies.
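The abstract does not spell out the tariff-market MDP, so the following is only a toy sketch of the underlying tabular Q-learning idea: an agent repeatedly picks a fixed tariff rate, observes a profit-like reward, and updates action values. The states (coarse market-share buckets), candidate rates, and toy demand model below are invented for illustration and are not VidyutVanika's actual formulation.

```python
import random

random.seed(0)  # reproducible exploration

STATES = range(2)             # hypothetical coarse market-share buckets
ACTIONS = [0.10, 0.15, 0.20]  # hypothetical fixed tariff rates ($/kWh)
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.2

# Tabular action-value function, initialized to zero.
Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}

def step(state, rate):
    """Toy environment: higher rates earn more per unit but shrink demand;
    low rates grow market share (state), high rates shrink it."""
    demand = max(0.0, 100.0 * (1.0 - 3.0 * rate)) * (1 + state)
    reward = rate * demand  # profit proxy
    next_state = min(1, max(0, state + (1 if rate <= 0.15 else -1)))
    return reward, next_state

def choose(state):
    """Epsilon-greedy action selection over the Q-table."""
    if random.random() < EPS:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

state = 0
for _ in range(20000):
    a = choose(state)
    r, s2 = step(state, a)
    best_next = max(Q[(s2, b)] for b in ACTIONS)
    # Standard Q-learning update toward the bootstrapped target.
    Q[(state, a)] += ALPHA * (r + GAMMA * best_next - Q[(state, a)])
    state = s2

# Greedy (learned) rate per state; 0.15 balances margin against demand loss.
greedy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in STATES}
print(greedy)
```

In this toy setting the learned greedy policy settles on the middle rate in every state, since it maximizes both immediate profit and future market share; the real agent's reward shaping and state features are, of course, far richer.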
