Creation of a federated database of blood proteins: a powerful new tool for finding and characterizing biomarkers in serum

Protein biomarkers offer major benefits for the diagnosis and monitoring of disease processes. Recent advances in protein mass spectrometry make it feasible to use this highly sensitive technology to detect and quantify proteins in blood. To explore the potential of blood biomarkers, we conducted a thorough review to evaluate the reliability of data in the literature and to determine the spectrum of proteins reported to exist in blood, with the goal of creating a Federated Database of Blood Proteins (FDBP). A unique feature of our approach is the use of an SQL database for all of the peptide data; the power of the SQL database, combined with standard informatics tools such as BLAST and the Statistical Analysis System (SAS), allowed rapid annotation and analysis of the database without the need to write special programs to manage the data. Our mathematical analysis and review show that, in addition to the usual secreted proteins found in blood, there are many reports of intracellular proteins, and good agreement on transcription factors and DNA remodelling factors as well as cellular receptors and their signal transduction enzymes. Overall, we have catalogued about 12,130 proteins identified by at least one unique peptide, and of these 3,858 have three or more peptide correlations. The FDBP with annotations should facilitate testing blood for specific disease biomarkers.
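As an illustration of how a relational database can produce counts of this kind without special-purpose programs, the following is a minimal sketch using Python's built-in sqlite3 module. The table name `peptide_hits`, its columns, the function `count_proteins`, and the toy rows are all hypothetical and are not the FDBP schema or data. The sketch also assumes that peptide support can be approximated by the number of distinct peptide sequences per protein, which may differ from how "peptide correlations" were counted in the study.

```python
import sqlite3


def count_proteins(rows, min_peptides=1):
    """Count proteins supported by at least `min_peptides` distinct peptides.

    `rows` is an iterable of (protein_id, peptide_seq, source) tuples.
    The schema is a hypothetical stand-in, not the actual FDBP layout.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE peptide_hits (protein_id TEXT, peptide_seq TEXT, source TEXT)"
    )
    conn.executemany("INSERT INTO peptide_hits VALUES (?, ?, ?)", rows)
    # Group peptide reports by protein, then keep proteins that meet the threshold.
    query = """
        SELECT COUNT(*) FROM (
            SELECT protein_id
            FROM peptide_hits
            GROUP BY protein_id
            HAVING COUNT(DISTINCT peptide_seq) >= ?
        )
    """
    (n,) = conn.execute(query, (min_peptides,)).fetchone()
    conn.close()
    return n


# Toy data for illustration only (invented identifiers and sequences).
rows = [
    ("P1", "AAAK", "study_a"),
    ("P1", "CCCK", "study_b"),
    ("P1", "DDDR", "study_a"),
    ("P1", "AAAK", "study_b"),  # same peptide reported by a second study
    ("P2", "EEEK", "study_a"),
    ("P3", "FFFR", "study_b"),
    ("P3", "GGGK", "study_b"),
]

print(count_proteins(rows, min_peptides=1))  # proteins with at least one peptide
print(count_proteins(rows, min_peptides=3))  # proteins with three or more peptides
```

With the toy rows above, the two calls return 3 and 1 respectively. Keeping the threshold as a query parameter means the same grouped query can serve both the "at least one peptide" and the "three or more" tallies.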
