Exploiting generative self-supervised learning for the assessment of biological images with lack of annotations: a COVID-19 case-study

Computer-aided analysis of biological images typically requires extensive training on large-scale annotated datasets, which is not viable in many situations. In this paper, we present GAN-DL, a Discriminator Learner based on the StyleGAN2 architecture, which we employ for self-supervised image representation learning on fluorescent biological images. We show that Wasserstein Generative Adversarial Networks combined with linear Support Vector Machines enable high-throughput compound screening based on raw images. We demonstrate this by classifying active and inactive compounds tested for the inhibition of SARS-CoV-2 infection in VERO and HRCE cell lines. In contrast to previous methods, our deep learning-based approach requires no annotation beyond what is normally collected during sample preparation. We test our technique on the RxRx19a SARS-CoV-2 image collection, which consists of fluorescent images generated to assess the ability of regulatory-approved compounds, or compounds in late-stage clinical trials, to modulate in vitro SARS-CoV-2 infection in both VERO and HRCE cell lines. We show that our technique can be exploited not only for classification tasks but also to derive a dose-response curve for the tested treatments in a self-supervised manner. Lastly, we demonstrate its generalization capabilities by successfully addressing a zero-shot learning task: the categorization of four different cell types in the RxRx1 fluorescent image collection.
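To make the downstream step concrete, the following is a minimal sketch of how a linear SVM could be fitted on fixed image embeddings and then reused to score treated samples along a dose axis. It is not the authors' implementation. The embeddings are synthetic Gaussian vectors standing in for discriminator features, the labels are assumed to come from control wells, and every name (`make_embeddings`, `train_linear_svm`, `score`, the dose values and shifts) is hypothetical. The SVM is trained with plain stochastic subgradient descent on the hinge loss so that the example needs only the Python standard library.

```python
import random

def make_embeddings(n, dim, shift, rng):
    # Hypothetical stand-in for discriminator features: Gaussian vectors centred at `shift`.
    return [[rng.gauss(shift, 1.0) for _ in range(dim)] for _ in range(n)]

def score(w, b, x):
    # Signed (unnormalised) distance of one embedding from the separating hyperplane.
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, epochs=50, lr=0.01, lam=0.01, seed=0):
    # Stochastic subgradient descent on the L2-regularised hinge loss.
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            violated = y[i] * score(w, b, X[i]) < 1
            w = [wj * (1 - lr * lam) for wj in w]  # weight decay from the L2 term
            if violated:  # hinge subgradient is non-zero only inside the margin
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b

rng = random.Random(42)
DIM = 8

# Assumed labels: +1 for uninfected-like controls, -1 for infected-like controls.
train_X = make_embeddings(100, DIM, 1.5, rng) + make_embeddings(100, DIM, -1.5, rng)
train_y = [1] * 100 + [-1] * 100
w, b = train_linear_svm(train_X, train_y)

test_X = make_embeddings(100, DIM, 1.5, rng) + make_embeddings(100, DIM, -1.5, rng)
test_y = [1] * 100 + [-1] * 100
accuracy = sum((score(w, b, x) > 0) == (t > 0) for x, t in zip(test_X, test_y)) / len(test_y)

# Toy dose-response: higher doses move embeddings toward the uninfected-like cluster.
doses = {0.1: -1.5, 1.0: -0.5, 10.0: 0.5, 100.0: 1.5}  # dose -> simulated shift
curve = []
for dose, shift in doses.items():
    wells = make_embeddings(50, DIM, shift, rng)
    curve.append((dose, sum(score(w, b, x) for x in wells) / len(wells)))

print(f"held-out accuracy: {accuracy:.2f}")
for dose, mean_score in curve:
    print(f"dose {dose:>6}: mean score {mean_score:+.2f}")
```

In this toy setting the mean hyperplane score rises with dose because the simulated embeddings are shifted toward the uninfected-like cluster by construction. With real embeddings, whether such a score tracks treatment efficacy is an empirical question.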
