Negotiated Learning for Smart Grid Agents: Entity Selection based on Dynamic Partially Observable Features

An attractive approach to managing electricity demand in the Smart Grid relies on real-time pricing (RTP) tariffs, where customers are incentivized to quickly adapt to changes in the cost of supply. However, choosing amongst competitive RTP tariffs is difficult when tariff prices change rapidly. The problem is further complicated when we assume that the price changes for a tariff are published in real-time only to those customers who are currently subscribed to that tariff, thus making the prices partially observable. We present models and learning algorithms for autonomous agents that can address the tariff selection problem on behalf of customers. We introduce Negotiated Learning, a general algorithm that enables a self-interested sequential decision-making agent to periodically select amongst a variable set of entities (e.g., tariffs) by negotiating with other agents in the environment to gather information about dynamic partially observable entity features (e.g., tariff prices) that affect the entity selection decision. We also contribute a formulation of the tariff selection problem as a Negotiable Entity Selection Process, a novel representation. We support our contributions with intuitive justification and simulation experiments based on real data on an open Smart Grid simulation platform.
