Edinburgh Research Explorer The mouse secretome

We have developed a computational strategy to identify the set of soluble proteins secreted into the extracellular environment of a cell. Within the protein sequences predominantly derived from the RIKEN representative transcript and protein set, we identified 2033 unique soluble proteins that are potentially secreted from the cell. These proteins contain a signal peptide required for entry into the secretory pathway and lack any transmembrane domains or intracellular localization signals. This class of proteins, which we have termed the mouse secretome, included >500 novel proteins and 92 proteins <100 amino acids in length. Functional analysis of the secretome included identification of human orthologs, functional units based on InterPro and SCOP Superfamily predictions, and expression of the proteins within the RIKEN READ microarray database. To highlight the utility of this information, we discuss the CUB domain-containing protein family. [Supplemental material

The RIKEN Mouse Gene Encyclopedia project aims to identify the full set of transcripts that are derived from the mouse genome (The FANTOM Consortium and the RIKEN Genome Exploration Research Group Phase I and II Team 2002). The 60,770 cDNA clones fully sequenced in the RIKEN project were selected from 246 full-length, enriched cDNA libraries derived from a range of tissue sources, predominantly from C57BL/6J mice.
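The selection criteria stated in the abstract (a signal peptide is present, and there are no transmembrane domains or intracellular localization signals) amount to a conjunctive filter over per-protein predictions. The following is a minimal sketch of that logic only, not the authors' pipeline: the `ProteinPrediction` record, its field names, and the helper functions are hypothetical, and the boolean flags are assumed to come from upstream prediction tools that are not specified here.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ProteinPrediction:
    """Hypothetical per-protein summary of upstream predictor output."""
    protein_id: str
    length: int                     # sequence length in amino acids
    has_signal_peptide: bool        # predicted N-terminal signal peptide
    transmembrane_domains: int      # number of predicted transmembrane domains
    has_intracellular_signal: bool  # any predicted intracellular localization signal


def is_candidate_secreted(p: ProteinPrediction) -> bool:
    """Apply the three criteria from the abstract as a single conjunction."""
    return (
        p.has_signal_peptide
        and p.transmembrane_domains == 0
        and not p.has_intracellular_signal
    )


def select_secretome(predictions: Iterable[ProteinPrediction]) -> List[ProteinPrediction]:
    """Keep only proteins that pass every criterion, preserving input order."""
    return [p for p in predictions if is_candidate_secreted(p)]


def count_short(secretome: Iterable[ProteinPrediction], max_length: int = 100) -> int:
    """Count proteins strictly shorter than max_length amino acids."""
    return sum(1 for p in secretome if p.length < max_length)
```

Under these assumptions, `select_secretome` would yield the candidate set and `count_short` would give the tally of proteins under 100 amino acids; the quality of the result depends entirely on the upstream predictions.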

[1]  Rolf Apweiler, et al.  Proteome Analysis Database, 2004.

[2]  Melissa J. Davis, et al.  Mouse proteome analysis, 2003, Genome research.

[3]  S. Batalov, et al.  Analysis of the mouse transcriptome for genes involved in the function of the nervous system, 2003, Genome research.

[4]  E. Birney, et al.  Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs, 2002, Nature.

[5]  Alex Bateman, et al.  InterPro: An Integrated Documentation Resource for Protein Families, Domains and Functional Sites, 2002, Briefings Bioinform.

[6]  Laurence Zitvogel, et al.  Exosomes: composition, biogenesis and function, 2002, Nature Reviews Immunology.

[7]  K. Karplus, et al.  Assignment of homology to genome sequences using a library of hidden Markov models that represent all proteins of known structure, 2001, Journal of molecular biology.

[8]  G. Rice, et al.  Proteomic analysis of human plasma: Failure of centrifugal ultrafiltration to remove albumin and other high molecular weight proteins, 2001, Proteomics.

[9]  M. Gerstein, et al.  Interrelating different types of genomic data, from proteome to secretome: 'oming in on function, 2001, Genome research.

[10]  M. B. Eisen, et al.  Delineating developmental and metabolic pathways in vivo by expression profiling using the RIKEN set of 18,816 full-length enriched mouse cDNA arrays, 2001, Proceedings of the National Academy of Sciences of the United States of America.

[11]  S. Grimmond, et al.  Cloning, mapping, and expression analysis of a gene encoding a novel mammalian EGF-related protein (SCUBE1), 2000, Genomics.

[12]  R. Hughes.  Secretion of the galectin family of mammalian carbohydrate-binding proteins, 1999, Biochimica et biophysica acta.

[13]  Paul A. Gleeson, et al.  Targeting of proteins to the Golgi apparatus, 1998, Histochemistry and Cell Biology.

[14]  S. Barondes, et al.  A new pathway for protein export in Saccharomyces cerevisiae, 1996, The Journal of cell biology.

[15]  S. White, et al.  Structure, function, and membrane integration of defensins, 1995, Current opinion in structural biology.

[16]  P. Bork, et al.  The CUB domain. A widespread module in developmentally regulated proteins, 1993, Journal of molecular biology.

[17]  H. Pelham, et al.  The retention signal for soluble proteins of the endoplasmic reticulum, 1990, Trends in biochemical sciences.

[18]  D. Mccormick.  Sequence the Human Genome, 1986, Bio/Technology.

[19]  B. Rost, et al.  State-of-the-art in membrane protein prediction, 2002, Applied bioinformatics.

[20]  Yoshihide Hayashizaki, et al.  READ: RIKEN Expression Array Database, 2002, Nucleic Acids Res.

[21]  Cyrus Chothia, et al.  SUPERFAMILY: HMMs representing all proteins of known structure. SCOP sequence searches, alignments and genome assignments, 2002, Nucleic Acids Res.

[22]  S. Grimmond, et al.  Expression of a novel mammalian epidermal growth factor-related gene during mouse neural development, 2001.

[23]  International Human Genome Sequencing Consortium.  Initial sequencing and analysis of the human genome, 2001, Nature.

[24]  Rolf Apweiler, et al.  Proteome Analysis Database: online application of InterPro and CluSTr for the functional classification of proteins in whole genomes, 2001, Nucleic Acids Res.

[25]  D. Hume, et al.  Localization and post-Golgi trafficking of tumor necrosis factor-alpha in macrophages, 2000, Journal of interferon & cytokine research : the official journal of the International Society for Interferon and Cytokine Research.

[26]  Development and evaluation of an automated annotation pipeline and cDNA annotation system, 2022.