Reward modeling for mitigating toxicity in transformer-based language models