Smart building real-time pricing for offering load-side Regulation Service reserves

Provision of Regulation Service (RS) reserves to power markets by smart building demand response has attracted attention in the recent literature. This paper develops tractable dynamic optimal pricing algorithms for distributed RS reserve provision. It establishes monotonicity and convexity properties of the optimal pricing policies and of the associated differential cost function, and uses them to propose and implement a modified Least Squares Temporal Differences (LSTD) Actor-Critic algorithm with a bounded, continuous action space. This algorithm finds the best policy within a broad pre-specified family. In addition, the paper develops a novel Approximate Policy Iteration (API) algorithm and uses it successfully to optimize the parameters of an analytic policy function. Numerical results compare the two approaches and show that the novel API algorithm outperforms the bounded LSTD Actor-Critic algorithm in both computational effort and the cost achieved by the resulting policy.
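To make the bounded-action Actor-Critic idea concrete, the following is a minimal, self-contained Python sketch, not the paper's implementation. It assumes a hypothetical one-dimensional pricing MDP (ToyPricingEnv), a small polynomial feature basis phi for the LSTD critic, and a Gaussian actor whose mean is squashed into a bounded price interval [A_MIN, A_MAX]. For simplicity it uses a discounted-cost TD error, whereas the paper works with a differential (average-cost) formulation; every name, dynamic, and step size below is an illustrative assumption.

# Illustrative sketch only: the paper's actual pricing MDP, features, and
# algorithmic details are not reproduced here. All names (ToyPricingEnv, phi,
# the squashed Gaussian actor, step sizes) are assumptions for demonstration.
import numpy as np

class ToyPricingEnv:
    """Hypothetical one-dimensional stand-in for the pricing MDP."""
    def __init__(self, seed=0):
        self.rng = np.random.default_rng(seed)
        self.state = 0.0

    def step(self, action):
        # Linear dynamics with noise; quadratic stage cost (pure placeholder).
        self.state = 0.9 * self.state - 0.5 * action + 0.1 * self.rng.normal()
        cost = self.state ** 2 + 0.01 * action ** 2
        return self.state, cost

def phi(s):
    """Critic features: a small polynomial basis (an assumption)."""
    return np.array([1.0, s, s ** 2])

A_MIN, A_MAX = 0.0, 1.0   # bounded, continuous action (price) interval

def actor_sample(theta, s, rng, sigma=0.1):
    """Gaussian policy whose mean is squashed into [A_MIN, A_MAX] by a logistic."""
    mean = A_MIN + (A_MAX - A_MIN) / (1.0 + np.exp(-(theta[0] + theta[1] * s)))
    a = np.clip(mean + sigma * rng.normal(), A_MIN, A_MAX)
    return a, mean

def run(num_steps=20000, gamma=0.95, alpha=1e-3, sigma=0.1, seed=0):
    rng = np.random.default_rng(seed)
    env = ToyPricingEnv(seed)
    theta = np.zeros(2)              # actor parameters
    k = phi(0.0).size
    A = 1e-3 * np.eye(k)             # LSTD accumulators (regularized)
    b = np.zeros(k)
    s = env.state
    for _ in range(num_steps):
        a, mean = actor_sample(theta, s, rng, sigma)
        s_next, cost = env.step(a)
        f, f_next = phi(s), phi(s_next)
        # LSTD critic: accumulate A, b and solve for the value weights w.
        A += np.outer(f, f - gamma * f_next)
        b += cost * f
        w = np.linalg.solve(A, b)
        # TD error and likelihood-ratio actor update; gradient *descent*
        # because we are minimizing discounted cost. The rare clipping of
        # the sampled action is ignored in the gradient (a simplification).
        delta = cost + gamma * f_next @ w - f @ w
        z = (mean - A_MIN) / (A_MAX - A_MIN)        # logistic output in (0,1)
        dmean_dtheta = (A_MAX - A_MIN) * z * (1 - z) * np.array([1.0, s])
        score = (a - mean) / sigma ** 2 * dmean_dtheta
        theta -= alpha * delta * score
        s = s_next
    return theta

if __name__ == "__main__":
    print("actor parameters:", run())

Squashing the policy mean through a logistic function is one simple way to keep a continuous action inside its bounds while retaining a differentiable likelihood-ratio gradient; the paper's API alternative would instead re-estimate the cost of a fixed analytic policy and then improve that policy's parameters, a step this toy sketch does not attempt to reproduce.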
