Fine-Tuning Language Models from Human Preferences