Behavioral control task supervisor with memory based on reinforcement learning for human–multi-robot coordination systems