Accurate classification of protein subcellular localization from high-throughput microscopy images using deep learning

High-throughput microscopy of many single cells generates high-dimensional data that are far from straightforward to analyze. One important problem is automatically detecting the cellular compartment in which a fluorescently tagged protein resides, a task that is relatively simple for an experienced human but difficult to automate. Here, we train an 11-layer neural network on data from mapping thousands of yeast proteins, achieving a per-cell localization classification accuracy of 91% and a per-protein accuracy of 99% on held-out images. We confirm that low-level network features correspond to basic image characteristics, while deeper layers separate localization classes. Using this network as a feature calculator, we train standard classifiers that assign proteins to previously unseen compartments after observing only a small number of training examples. Our results are the most accurate subcellular localization classifications to date, and they demonstrate the usefulness of deep learning for high-throughput microscopy.
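The gap between per-cell and per-protein accuracy can be illustrated with a minimal sketch in which the per-cell predictions for each tagged protein are pooled into a single call. The pooling rule used here (a simple majority vote), the function name `per_protein_call`, and the example labels are assumptions made for illustration only; the abstract does not specify how per-cell predictions are combined.

```python
from collections import Counter


def per_protein_call(cell_labels):
    """Pool per-cell labels for one protein by majority vote (assumed rule).

    Returns the most common label and the fraction of cells that voted for it.
    """
    counts = Counter(cell_labels)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(cell_labels)


# Hypothetical per-cell predictions for two tagged proteins (not real data).
cells = {
    "protein_A": ["nucleus"] * 9 + ["cytoplasm"],
    "protein_B": ["mitochondria"] * 7 + ["vacuole"] * 3,
}

# One pooled call per protein, even though some individual cells disagree.
calls = {name: per_protein_call(labels) for name, labels in cells.items()}
```

Under this assumed rule, a protein is assigned correctly whenever most of its cells are, so occasional per-cell errors need not change the per-protein call.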
