Towards defining the nuclear proteome

Background: The nucleus is a complex cellular organelle, and accurately defining its protein content is essential before any systematic characterization can be considered.

Results: We report direct evidence for 2,568 mammalian proteins within the nuclear proteome: 1,529 proteins whose nuclear localization was determined with a high-throughput subcellular localization protocol applied to full-length proteins, and an additional 1,039 proteins for which clear experimental evidence is documented in the published literature. This is direct evidence that the nuclear proteome makes up at least 14% of the entire proteome. This dataset was used to evaluate computational approaches designed to identify additional nuclear proteins.

Conclusion: This represents direct experimental evidence that the nuclear proteome makes up at least 14% of the entire proteome. This high-quality nuclear proteome dataset was used to evaluate computational approaches designed to identify additional nuclear proteins. Based on this analysis, researchers can choose the stringency and the types of evidence they require when inferring the size and composition of the nuclear proteome.
