Generating Images with Multimodal Language Models