Mitigating Metastable Failures in Distributed Systems with Offline Reinforcement Learning