MaXM: Towards Multilingual Visual Question Answering