Batch reinforcement learning for network-safe demand response in unknown electric grids