A reinforcement learning approach to autonomous decision-making in smart electricity markets

The vision of a Smart Electric Grid relies critically on substantial advances in intelligent decentralized control mechanisms. We propose a novel class of autonomous broker agents for retail electricity trading that can operate in a wide range of Smart Electricity Markets and that are capable of deriving long-term, profit-maximizing policies. Our brokers use Reinforcement Learning with function approximation; they can accommodate arbitrary economic signals from their environments and learn efficiently over the large state spaces that these signals induce. We show how feature selection and regularization can be leveraged to automatically optimize brokers for particular market conditions, and we demonstrate the performance of our design in extensive experiments using real-world energy market data.
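To make the approach concrete, the following is a minimal, illustrative sketch (not the authors' implementation) of the combination of techniques the abstract names: Q-learning with linear function approximation over a feature vector of economic signals, with an L1 shrinkage step that drives the weights of irrelevant signals to zero and thus acts as embedded feature selection. The broker class, action names, and all parameter values are hypothetical.

```python
import random

random.seed(0)  # for a reproducible illustration


class LinearQBroker:
    """Toy broker: Q-learning with linear function approximation.

    Each action has a weight vector over the economic-signal features;
    Q(s, a) is the dot product of that vector with the feature vector phi.
    """

    def __init__(self, n_features, actions, alpha=0.1, gamma=0.95,
                 epsilon=0.1, l1=0.0):
        self.actions = list(actions)
        self.w = {a: [0.0] * n_features for a in self.actions}
        self.alpha, self.gamma, self.epsilon, self.l1 = alpha, gamma, epsilon, l1

    def q(self, phi, a):
        return sum(wi * xi for wi, xi in zip(self.w[a], phi))

    def act(self, phi):
        # Epsilon-greedy exploration over the (hypothetical) tariff actions.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q(phi, a))

    def update(self, phi, a, reward, phi_next):
        # Standard one-step Q-learning target with a greedy bootstrap.
        target = reward + self.gamma * max(self.q(phi_next, b)
                                           for b in self.actions)
        td_error = target - self.q(phi, a)
        for i, xi in enumerate(phi):
            self.w[a][i] += self.alpha * td_error * xi
            # Soft-threshold (L1) shrinkage: weights of uninformative
            # signals decay to exactly zero, selecting features implicitly.
            if self.w[a][i] > self.l1:
                self.w[a][i] -= self.l1
            elif self.w[a][i] < -self.l1:
                self.w[a][i] += self.l1
            else:
                self.w[a][i] = 0.0
```

A usage sketch: with two hypothetical signals (say, normalized demand and a noise feature) and actions `"raise"`/`"lower"` for adjusting a retail tariff, the broker picks an action via `act(phi)`, observes a profit-based reward, and calls `update(...)`; over many transitions, the L1 term keeps only the signals that actually predict profit.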
