Reasoning or Reciting? Exploring the Capabilities and Limitations of Language Models Through Counterfactual Tasks