A Sentence is Worth a Thousand Pictures: Can Large Language Models Understand Human Language?