Disentangling Information in Artificial Images of Plant Seedlings Using Semi-Supervised GAN

Lack of annotated data for training of deep learning systems is a challenge for many visual recognition tasks. This is especially true for domain-specific applications, such as plant detection and recognition, where the annotation process can be both time-consuming and error-prone. Generative models can be used to alleviate this issue by producing artificial data that mimic properties of real data. This work presents a semi-supervised generative adversarial network (GAN) model to produce artificial samples of plant seedlings. By applying the semi-supervised approach, we are able to produce visually distinct samples for nine unique plant species using a single GAN model, while still maintaining a relatively high visual variance in the produced samples for each species. Additionally, we are able to control the appearance of the generated samples with respect to rotation and size through a set of latent variables, despite these not being annotated features in the training data. The generated samples resemble the intended species with an average recognition accuracy of ∼64.3%, evaluated using an external state-of-the-art plant seedling classification model. Additionally, we explore the potential of using the GAN model’s discriminator as a quality assessment tool to remove poor representations of plant seedlings from the artificial samples.