Why Do Neural Language Models Still Need Commonsense Knowledge to Handle Semantic Variations in Question Answering?