The Dialog Must Go On: Improving Visual Dialog via Generative Self-Training