Data Augmentation with GAN: Improving Chest X-Ray Pathologies Prediction on Class-Imbalanced Cases

When machine learning is applied to a real-world problem, class imbalance in the data can have a crucial impact on the resulting model’s performance. We propose using a generative adversarial network (GAN) to balance the data through augmentation in the preprocessing step of a binary classification task. We train a CycleGAN on unpaired images so that, for any given input image, it can produce a corresponding image from the opposite class. After training, we use it to generate an opposite-class counterpart for every image in a given imbalanced dataset, thus making the dataset fully balanced.
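The balancing step can be illustrated with a minimal sketch. It is not the implementation used in this work: the function `balance_with_generators` and the two placeholder generators are hypothetical names introduced here, and the placeholders only shift pixel values, where a trained CycleGAN would supply the two learned mappings. The sketch shows why the result is balanced: a dataset with n0 negative and n1 positive images gains n0 synthetic positives and n1 synthetic negatives, so each class ends up with n0 + n1 images.

```python
from typing import Callable, List, Tuple

Image = List[float]          # stand-in for an image array (assumption)
Sample = Tuple[Image, int]   # (image, binary label)


def balance_with_generators(
    dataset: List[Sample],
    g_neg_to_pos: Callable[[Image], Image],
    g_pos_to_neg: Callable[[Image], Image],
) -> List[Sample]:
    """Add one opposite-class counterpart for every image in the dataset."""
    augmented = list(dataset)  # keep all original samples
    for image, label in dataset:
        if label == 0:
            # Negative image: synthesize a positive counterpart.
            augmented.append((g_neg_to_pos(image), 1))
        else:
            # Positive image: synthesize a negative counterpart.
            augmented.append((g_pos_to_neg(image), 0))
    return augmented


# Hypothetical placeholders for the two trained CycleGAN generators.
def fake_neg_to_pos(image: Image) -> Image:
    return [p + 1.0 for p in image]


def fake_pos_to_neg(image: Image) -> Image:
    return [p - 1.0 for p in image]


# Toy imbalanced dataset: 8 negatives, 2 positives.
dataset: List[Sample] = [([0.0, 0.1], 0)] * 8 + [([1.0, 0.9], 1)] * 2

balanced = balance_with_generators(dataset, fake_neg_to_pos, fake_pos_to_neg)
counts = {c: sum(1 for _, y in balanced if y == c) for c in (0, 1)}
print(counts)  # {0: 10, 1: 10}
```

The augmented set is twice the size of the original, and the class ratio is 1:1 regardless of how imbalanced the input was.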