Index Policies for Demand Response

Demand response programs incentivize loads to actively moderate their energy consumption in support of the power system. Uncertainty is intrinsic to demand response because a load's curtailment capability is often unknown until the load has been deployed. Algorithms must therefore balance exploiting well-characterized, good loads against learning about poorly characterized but potentially good ones; this is a manifestation of the classical tradeoff between exploration and exploitation. We address the tradeoff in a restless bandit framework, a generalization of the well-known multi-armed bandit problem. The formulation yields index policies in which loads are ranked by a scalar index and those with the highest indices are deployed. Index policies are particularly well suited to demand response because the indices have explicit analytical expressions that can be evaluated separately for each load, making the policy both simple and scalable. The formulation also serves as a heuristic basis for the case in which only the aggregate effect of demand response is observed, so that the state of each individual load must be inferred. We derive a tractable, analytical approximation for inferring individual load states from observations of aggregate load curtailments. In numerical examples, the restless bandit policy outperforms the greedy policy by 5%-10% of the total cost. When the states of deployed loads are inferred from aggregate measurements, the resulting performance degradation of the (now heuristic) restless bandit policy is on the order of a few percent.
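The selection step of an index policy can be sketched in a few lines: each load's index is computed independently, and the operator deploys the loads with the highest indices. The sketch below is illustrative only; the index values are placeholders standing in for the paper's analytical index expressions, and the function name `deploy_by_index` is a hypothetical helper, not from the paper.

```python
import numpy as np

def deploy_by_index(indices: np.ndarray, k: int) -> np.ndarray:
    """Rank loads by their scalar indices and return the IDs of the
    k highest-index loads (the loads selected for deployment)."""
    # argsort ascending, then reverse for a descending ranking.
    order = np.argsort(indices)[::-1]
    return order[:k]

# Placeholder per-load indices; in the paper these would come from
# explicit analytical expressions evaluated separately per load.
idx = np.array([0.3, 0.9, 0.1, 0.7])
selected = deploy_by_index(idx, k=2)
print(selected.tolist())  # loads 1 and 3 have the highest indices
```

Because each index depends only on that load's own state, the ranking scales linearly in the number of loads plus the cost of the sort, which is what makes the policy practical for large demand response populations.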
