Energy-efficient control of thermal comfort in multi-zone residential HVAC via reinforcement learning