CLIP Models are Few-Shot Learners: Empirical Studies on VQA and Visual Entailment