Canvas GAN: Bootstrapped Image-Conditional Models

Generative Adversarial Networks (GANs) learn generating functions that map a random noise distribution $Z$ to a target data distribution. Little attention is usually paid to the form of $Z$; nearly all models assume it follows a convenient parametric form, almost always $Z \sim \mathrm{Normal}(0,1)$ or $Z \sim \mathrm{Uniform}(-1,1)$. However, we observe that image-conditional generators, such as those used in CycleGAN, produce higher-quality generations than comparable single-domain GANs can currently achieve. This holds even when the images being conditioned upon are substantially different from the target images. We hypothesize that these models benefit from input that already has the general structure of natural images, even if its semantic content differs. We therefore propose the Canvas GAN: using just a small handful of real images (“canvases”), we create random input for an image-conditional generator by randomly cropping, flipping along either or both axes, coloring, and resizing a canvas. These diverse samples allow the generator to edit an input that already exhibits natural image structure, rather than having to generate that structure from scratch from independent white noise.
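
The following is a minimal sketch of what such a canvas-augmentation pipeline could look like, not the authors' exact implementation. It assumes PyTorch/torchvision, and all parameter values (crop scale, output size, jitter strengths) are illustrative placeholders.

```python
# Sketch: turn a small set of "canvas" images into random inputs for an
# image-conditional generator via random crop, flips, color perturbation, and resize.
# Assumes torchvision is available; hyperparameters below are assumptions, not from the paper.
import random
from PIL import Image
from torchvision import transforms

canvas_augment = transforms.Compose([
    transforms.RandomResizedCrop(size=128, scale=(0.3, 1.0)),   # random crop, resized to the generator's input size
    transforms.RandomHorizontalFlip(p=0.5),                     # flip along the horizontal axis
    transforms.RandomVerticalFlip(p=0.5),                       # flip along the vertical axis
    transforms.ColorJitter(brightness=0.3, contrast=0.3,
                           saturation=0.3, hue=0.1),            # random "coloring" of the canvas
    transforms.ToTensor(),                                      # tensor for the generator
])

def sample_generator_input(canvas_paths):
    """Pick one canvas from a small handful and augment it into a random generator input."""
    canvas = Image.open(random.choice(canvas_paths)).convert("RGB")
    return canvas_augment(canvas)  # used in place of i.i.d. white noise
```

Each call draws a different crop, flip, and color perturbation, so even a handful of canvases yields a diverse input distribution that retains natural image structure.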