Learning unsupervised feature representations for single cell microscopy images with paired cell inpainting

Cellular microscopy images contain rich insights about biology. To extract this information, researchers use features, or measurements of the patterns of interest in the images. Here, we introduce a convolutional neural network (CNN) to automatically design features for fluorescence microscopy. We use a self-supervised method to learn feature representations of single cells in microscopy images without labelled training data. We train CNNs on a simple task that leverages the inherent structure of microscopy images and controls for variation in cell morphology and imaging: given one cell from an image, the CNN is asked to predict the fluorescence pattern in a second, different cell from the same image. We show that our method learns high-quality features that describe protein expression patterns in single cells in both yeast and human microscopy datasets. Moreover, we demonstrate that our features are useful for exploratory biological analysis, by capturing high-resolution cellular components in a proteome-wide cluster analysis of human proteins, and by quantifying multi-localized proteins and single-cell variability. We believe paired-cell inpainting is a generalizable method to obtain feature representations of single cells in multichannel microscopy images.
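The pairing step behind the training task can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each cell crop is stored as a dictionary mapping a channel name to a 2D image, and the channel names `protein` and `marker`, as well as the functions `sample_cell_pair` and `make_training_example`, are hypothetical names introduced only for illustration. The abstract does not specify which channels of the second cell the network sees, so giving it the remaining channels as context is also an assumption.

```python
import random


def sample_cell_pair(cells, rng):
    """Pick two distinct cell crops that come from the same image.

    `cells` is a list of crops from ONE microscopy image; each crop is a
    dict mapping a channel name to a 2D image (hypothetical layout).
    """
    if len(cells) < 2:
        raise ValueError("need at least two cells from the same image")
    # random.Random.sample draws without replacement, so the two cells differ.
    source, target = rng.sample(cells, 2)
    return source, target


def make_training_example(cells, rng, predicted_channel="protein"):
    """Build one (source, context, label) triple for the prediction task.

    Assumption: the network sees every channel of the source cell and every
    channel of the target cell except `predicted_channel`, which it must
    predict. The held-out channel of the target cell is the label.
    """
    source, target = sample_cell_pair(cells, rng)
    context = {name: img for name, img in target.items()
               if name != predicted_channel}
    label = target[predicted_channel]
    return source, context, label
```

Because both cells are drawn from the same image, no manual annotation is needed to form a training example: the held-out channel of the second cell serves as the prediction target.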
