VidyutVanika: A Reinforcement Learning Based Broker Agent for a Power Trading Competition

A smart grid is an efficient and sustainable energy system that integrates diverse generation entities, distributed storage capacity, and smart appliances and buildings. A smart grid brings new kinds of participants into the energy market it serves, whose effect on the grid can only be determined through high-fidelity simulations. Power TAC offers one such simulation platform, built on real-world weather data and complex, state-of-the-art customer models. In Power TAC, autonomous energy brokers compete to make profits across tariff, wholesale, and balancing markets while maintaining the stability of the grid. In this paper, we describe the design of VidyutVanika, an autonomous broker and the runner-up in the 2018 Power TAC competition. VidyutVanika relies on reinforcement learning (RL) in the tariff market and dynamic programming in the wholesale market, solving modified versions of known Markov Decision Process (MDP) formulations in the respective markets. The novelty lies in how we define the reward functions of these MDPs, solve them, and map their solutions to concrete market actions. Unlike previous participating agents, VidyutVanika uses a neural network to predict the energy consumption of various customers from weather data. We use several heuristics to bridge the gap between the restricted action spaces of the MDPs and the much more extensive action space available to VidyutVanika. These heuristics allow VidyutVanika to convert near-optimal fixed tariffs into time-of-use tariffs aimed at mitigating transmission capacity fees, spread its orders across several auctions in the wholesale market to procure energy at a lower price, estimate the parameters required to implement the wholesale MDP solution more accurately, and account for wholesale procurement costs while optimizing tariffs. We use Power TAC 2018 tournament data and controlled experiments to analyze the performance of VidyutVanika and to illustrate the efficacy of these strategies.
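The abstract does not spell out the tariff-market MDP, so the following is only a toy sketch of the underlying tabular Q-learning idea: an agent repeatedly picks a fixed tariff rate, observes a profit-like reward, and updates action values. The states (coarse market-share buckets), candidate rates, and toy demand model below are invented for illustration and are not VidyutVanika's actual formulation.

```python
import random

random.seed(0)  # reproducible exploration

STATES = range(2)             # hypothetical coarse market-share buckets
ACTIONS = [0.10, 0.15, 0.20]  # hypothetical fixed tariff rates ($/kWh)
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.2

# Tabular action-value function, initialized to zero.
Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}

def step(state, rate):
    """Toy environment: higher rates earn more per unit but shrink demand;
    low rates grow market share (state), high rates shrink it."""
    demand = max(0.0, 100.0 * (1.0 - 3.0 * rate)) * (1 + state)
    reward = rate * demand  # profit proxy
    next_state = min(1, max(0, state + (1 if rate <= 0.15 else -1)))
    return reward, next_state

def choose(state):
    """Epsilon-greedy action selection over the Q-table."""
    if random.random() < EPS:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

state = 0
for _ in range(20000):
    a = choose(state)
    r, s2 = step(state, a)
    best_next = max(Q[(s2, b)] for b in ACTIONS)
    # Standard Q-learning update toward the bootstrapped target.
    Q[(state, a)] += ALPHA * (r + GAMMA * best_next - Q[(state, a)])
    state = s2

# Greedy (learned) rate per state; 0.15 balances margin against demand loss.
greedy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in STATES}
print(greedy)
```

In this toy setting the learned greedy policy settles on the middle rate in every state, since it maximizes both immediate profit and future market share; the real agent's reward shaping and state features are, of course, far richer.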
