Ensembles of GANs for synthetic training data generation

Insufficient training data is a major bottleneck for most deep learning practices, not least in medical imaging where data is difficult to collect and publicly available datasets are scarce due to ethics and privacy. This work investigates the use of synthetic images, created by generative adversarial networks (GANs), as the only source of training data. We demonstrate that for this application, it is of great importance to make use of multiple GANs to improve the diversity of the generated data, i.e. to sufficiently cover the data distribution. While a single GAN can generate seemingly diverse image content, training on this data in most cases lead to severe over-fitting. We test the impact of ensembled GANs on synthetic 2D data as well as common image datasets (SVHN and CIFAR-10), and using both DCGANs and progressively growing GANs. As a specific use case, we focus on synthesizing digital pathology patches to provide anonymized training data.

[1]  Trung Le,et al.  MGAN: Training Generative Adversarial Nets with Multiple Generators , 2018, ICLR.

[2]  Meyke Hermsen,et al.  1399 H&E-stained sentinel lymph node sections of breast cancer patients: the CAMELYON dataset , 2018, GigaScience.

[3]  Hayit Greenspan,et al.  GAN-based Synthetic Medical Image Augmentation for increased CNN Performance in Liver Lesion Classification , 2018, Neurocomputing.

[4]  Jaakko Lehtinen,et al.  Progressive Growing of GANs for Improved Quality, Stability, and Variation , 2017, ICLR.

[5]  Ashish Khetan,et al.  PacGAN: The Power of Two Samples in Generative Adversarial Networks , 2017, IEEE Journal on Selected Areas in Information Theory.

[6]  Soumith Chintala,et al.  Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks , 2015, ICLR.

[7]  Yoshua Bengio,et al.  Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.

[8]  Joost van de Weijer,et al.  Ensembles of Generative Adversarial Networks , 2016, ArXiv.

[9]  Jeffrey L. Gunter,et al.  Medical Image Synthesis for Data Augmentation and Anonymization using Generative Adversarial Networks , 2018, SASHIMI@MICCAI.

[10]  Timo Aila,et al.  A Style-Based Generator Architecture for Generative Adversarial Networks , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Andrew Y. Ng,et al.  Reading Digits in Natural Images with Unsupervised Feature Learning , 2011 .

[12]  Yuichi Yoshida,et al.  Spectral Normalization for Generative Adversarial Networks , 2018, ICLR.

[13]  Charles A. Sutton,et al.  VEEGAN: Reducing Mode Collapse in GANs using Implicit Variational Learning , 2017, NIPS.

[14]  Liang Chen,et al.  GAN Augmentation: Augmenting Training Data using Generative Adversarial Networks , 2018, ArXiv.

[15]  Han Zhang,et al.  Self-Attention Generative Adversarial Networks , 2018, ICML.

[16]  John T. Guibas,et al.  Synthetic Medical Images from Dual Generative Adversarial Networks , 2017, ArXiv.

[17]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[18]  Sepp Hochreiter,et al.  GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium , 2017, NIPS.

[19]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[20]  Tanveer F. Syeda-Mahmood,et al.  Chest x-ray generation and data augmentation for cardiovascular abnormality classification , 2018, Medical Imaging.

[21]  David Bau,et al.  Diverse Image Generation via Self-Conditioned GANs , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Stefano Ermon,et al.  Boosted Generative Models , 2016, AAAI.

[23]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[24]  Ali Borji,et al.  Pros and Cons of GAN Evaluation Measures , 2018, Comput. Vis. Image Underst..

[25]  Boi Faltings,et al.  Generating Artificial Data for Private Deep Learning , 2018, 1803.03148.

[26]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Jinsung Yoon,et al.  Anonymization Through Data Synthesis Using Generative Adversarial Networks (ADS-GAN) , 2020, IEEE Journal of Biomedical and Health Informatics.

[28]  Bernhard Schölkopf,et al.  AdaGAN: Boosting Generative Models , 2017, NIPS.

[29]  Cordelia Schmid,et al.  How good is my GAN? , 2018, ECCV.

[30]  Jeff Donahue,et al.  Large Scale GAN Training for High Fidelity Natural Image Synthesis , 2018, ICLR.