COMPARTMENTS: unification and visualization of protein subcellular localization evidence

Information on protein subcellular localization is important for understanding the cellular functions of proteins. Currently, such information is manually curated from the literature, obtained from high-throughput microscopy-based screens and predicted from primary sequence. To get a comprehensive view of the localization of a protein, it is thus necessary to consult multiple databases and prediction tools. To address this, we present the COMPARTMENTS resource, which integrates all sources listed above as well as the results of automatic text mining. The resource is automatically kept up to date with source databases, and all localization evidence is mapped onto common protein identifiers and Gene Ontology terms. To make different types and sources of evidence comparable, we assign confidence scores based on the type and source of the localization evidence. Finally, we visualize the unified localization evidence for a protein on a schematic cell to provide a simple overview.

Database URL: http://compartments.jensenlab.org
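The unification step described in the abstract can be illustrated with a minimal sketch. It assumes that each piece of evidence has already been mapped to a common protein identifier and a Gene Ontology cellular component term, and that every source reports a confidence score on a shared scale. The `Evidence` record, the `unify` function, the channel names, the identifier `P1` and the numeric scores are all hypothetical and chosen for illustration; they are not the resource's actual data model or scoring scheme.

```python
from collections import defaultdict
from typing import NamedTuple


class Evidence(NamedTuple):
    protein: str   # common protein identifier (hypothetical value below)
    go_term: str   # Gene Ontology cellular component term
    channel: str   # evidence type; the names used here are illustrative
    score: float   # confidence on an assumed shared scale


def unify(records):
    """Group evidence by (protein, GO term), keeping the best score per channel."""
    merged = defaultdict(dict)
    for r in records:
        key = (r.protein, r.go_term)
        merged[key][r.channel] = max(r.score, merged[key].get(r.channel, 0.0))
    return dict(merged)


# Made-up records: GO:0005634 is nucleus, GO:0005739 is mitochondrion.
records = [
    Evidence("P1", "GO:0005634", "knowledge", 5.0),
    Evidence("P1", "GO:0005634", "textmining", 2.5),
    Evidence("P1", "GO:0005634", "textmining", 3.0),
    Evidence("P1", "GO:0005739", "predictions", 1.0),
]
unified = unify(records)
```

Keying on the protein and compartment pair keeps evidence from different channels side by side, so one lookup shows which sources support a given localization and how strongly.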
