Once bitten, twice shy: The overgeneralization trap and epistemic learning after policy failure