Multimodal foundation models are better simulators of the human brain