Residential HVAC Aggregation Based on Risk-averse Multi-armed Bandit Learning for Secondary Frequency Regulation

As the penetration of renewable energy increases, stochastic and intermittent generation resources are gradually replacing conventional generators, which makes stabilizing power system frequency significantly more challenging. Aggregating demand-side resources for frequency regulation has therefore attracted attention from both academia and industry. In practice, however, conventional aggregation approaches suffer from the random and uncertain behavior of users, such as opting out of control signals. A risk-averse multi-armed bandit learning approach is adopted to learn user behavior, and a novel aggregation strategy is developed for residential heating, ventilation, and air conditioning (HVAC) systems to provide reliable secondary frequency regulation. Simulation results show that, compared with the conventional approach, the risk-averse multi-armed bandit learning approach performs better in secondary frequency regulation, with fewer users selected and fewer users opting out of control. The proposed approach is also more robust to random and changing user behavior.