Towards a Systematics for Protein Subcellular Location: Quantitative Description of Protein Localization Patterns and Automated Analysis of Fluorescence Microscope Images

Determination of the functions of all expressed proteins represents one of the major upcoming challenges in computational molecular biology. Since subcellular location plays a crucial role in protein function, the availability of systems that can predict location from sequence or high-throughput systems that determine location experimentally will be essential to the full characterization of expressed proteins. The development of prediction systems is currently hindered by an absence of training data that adequately captures the complexity of protein localization patterns. What is needed is a systematics for the subcellular locations of proteins. This paper describes an approach to the quantitative description of protein localization patterns using numerical features and the use of these features to develop classifiers that can recognize all major subcellular structures in fluorescence microscope images. Such classifiers provide a valuable tool for experiments aimed at determining the subcellular distributions of all expressed proteins. The features also have application in automated interpretation of imaging experiments, such as the selection of representative images or the rigorous statistical comparison of protein distributions under different experimental conditions. A key conclusion is that, at least in certain cases, these automated approaches are better able to distinguish similar protein localization patterns than human observers.
