Few-Shot Preference Learning for Human-in-the-Loop RL