Automated image analysis of protein localization in budding yeast

MOTIVATION The yeast Saccharomyces cerevisiae was the first eukaryotic organism to have its genome completely sequenced. Since then, several large-scale analyses of the yeast genome have provided extensive functional annotations of individual genes and proteins. One fundamental property of a protein is its subcellular localization, which provides critical information about how the protein works in a cell. An important project was therefore the creation of the yeast GFP fusion localization database by the University of California, San Francisco, USA (UCSF). This database provides localization data for 75% of the proteins believed to be encoded by the yeast genome. These proteins were classified into 22 distinct subcellular location categories by visual examination. Based on our past success at building automated systems to classify subcellular location patterns in mammalian cells, we sought to create a similar system for yeast.
RESULTS We developed computational methods to automatically analyze the images created by the UCSF yeast GFP fusion localization project. The system was trained to recognize the same location categories that were used in that study. We applied the system to 2640 images, and it assigned the same label as the previous visual assignment to 2139 of them (81%). When only the highest-confidence assignments were considered, agreement rose to 94.7%. Visual examination of the proteins for which the two approaches disagree suggests that at least some of the automated assignments may be more accurate. The automated method provides an objective, quantitative and repeatable assignment of protein locations that can be applied to new collections of yeast images (e.g. for different strains, or for the same strain under different conditions). Notably, this performance was achieved without requiring colocalization with any marker proteins.
AVAILABILITY The original images analyzed in this article are available at http://yeastgfp.ucsf.edu, and source code and results are available at http://murphylab.web.cmu.edu/software.
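
The agreement figures reported under RESULTS can be illustrated with a minimal sketch. Only the counts 2139 and 2640 come from the abstract; the function names `agreement` and `agreement_above`, the toy labels, the confidence scores and the threshold are hypothetical and are not taken from the authors' released code.

```python
def agreement(auto_labels, manual_labels):
    """Fraction of images whose automated label equals the manual label."""
    if len(auto_labels) != len(manual_labels) or not auto_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(a == m for a, m in zip(auto_labels, manual_labels))
    return matches / len(auto_labels)


def agreement_above(auto_labels, manual_labels, confidences, threshold):
    """Agreement restricted to assignments with confidence >= threshold.

    Returns None when no assignment passes the threshold.
    """
    kept = [(a, m)
            for a, m, c in zip(auto_labels, manual_labels, confidences)
            if c >= threshold]
    if not kept:
        return None
    return sum(a == m for a, m in kept) / len(kept)


# Overall agreement from the counts stated in the abstract.
overall = 2139 / 2640
print(f"{overall:.1%}")  # 81.0%

# Toy data (illustrative only): four images, one disagreement at low confidence.
auto = ["nucleus", "cytoplasm", "mitochondrion", "ER"]
manual = ["nucleus", "cytoplasm", "ER", "ER"]
conf = [0.90, 0.80, 0.40, 0.95]

print(agreement(auto, manual))                   # 0.75
print(agreement_above(auto, manual, conf, 0.75))  # 1.0
```

In this toy case, discarding the low-confidence assignment removes the only disagreement, which mirrors in miniature how restricting attention to the highest-confidence assignments can raise agreement.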
