Mass Informatics in Differential Proteomics

Systems biology aims to understand biological systems on a comprehensive scale, characterizing how the components that make up the whole are connected to one another and work together. As a major component of systems biology, differential proteomics studies the differences between distinct but related proteomes, such as normal versus diseased cells or diseased versus treated cells. High-throughput mass spectrometry (MS)-based analytical platforms are widely used in differential proteomics (Domon, 2006; Fenselau, 2007). In common practice, the proteome is first digested into peptides. The peptide mixture is then separated by multidimensional liquid chromatography (MDLC) and finally analyzed by MS. A single experiment generates thousands of mass spectra, and discovering the significantly changed proteins from the millions of peaks they contain is the task of mass informatics. This paper introduces the data mining steps used in mass informatics and concludes with a descriptive examination of concepts, trends, and challenges in this rapidly expanding field.

[1] F. Regnier, et al. A method for the identification of glycoproteins from human serum by a combination of lectin affinity chromatography along with anion exchange and Cu-IMAC selection of tryptic peptides. 2007, Journal of chromatography. B, Analytical technologies in the biomedical and life sciences.

[2] F. Regnier, et al. An automated method for the analysis of stable isotope labeling data in proteomics. 2005, Journal of the American Society for Mass Spectrometry.

[3] Tommi S. Jaakkola, et al. Maximum-likelihood estimation of optimal scaling factors for expression array normalization. 2001, SPIE BiOS.

[4] J. Yates, et al. An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. 1994, Journal of the American Society for Mass Spectrometry.

[5] D. N. Perkins, et al. Probability‐based protein identification by searching sequence databases using mass spectrometry data. 1999, Electrophoresis.

[6] O. Kvalheim, et al. Pretreatment of mass spectral profiles: application to proteomic data. 2007, Analytical chemistry.

[7] Hua Tang, et al. A statistical method for chromatographic alignment of LC-MS data. 2007, Biostatistics.

[8] Richard D. Smith, et al. Robust algorithm for alignment of liquid chromatography-mass spectrometry analyses in an accurate mass and time tag data analysis pipeline. 2006, Analytical chemistry.

[9] F. Regnier, et al. Proteomics of glycoproteins based on affinity selection of glycopeptides from tryptic digests. 2001, Journal of chromatography. B, Biomedical sciences and applications.

[10] Umpei Nagashima, et al. De novo peptide sequencing using ion peak intensity and amino acid cleavage intensity ratio. 2007, Bioinform.

[11] T. Shaler, et al. Quantification of proteins and metabolites by mass spectrometry without isotopic labeling or spiked standards. 2003, Analytical chemistry.

[12] Illés J. Farkas, et al. CFinder: locating cliques and overlapping modules in biological networks. 2006, Bioinform.

[13] Xiang Zhang, et al. Data pre-processing in liquid chromatography-mass spectrometry-based proteomics. 2005, Bioinform.

[14] R. Aebersold, et al. Automated statistical analysis of protein abundance ratios from data generated by stable-isotope dilution and tandem mass spectrometry. 2003, Analytical chemistry.

[15] C. Fenselau. A review of quantitative methods for proteomic studies. 2007, Journal of chromatography. B, Analytical technologies in the biomedical and life sciences.

[16] Xiang Zhang, et al. In-gel stable isotope labeling for relative quantification using mass spectrometry. 2006, Nature Protocols.

[17] R. Aebersold, et al. Mass Spectrometry and Protein Analysis. 2006, Science.

[18] J. Glimm, et al. Detection of cancer-specific markers amid massive mass spectral data. 2003, Proceedings of the National Academy of Sciences of the United States of America.

[19] J. Yates, et al. A correlation algorithm for the automated quantitative analysis of shotgun proteomics data. 2003, Analytical chemistry.

[20] J. Marchese, et al. Comparative study of [Three] LC-MALDI workflows for the analysis of complex proteomic samples. 2005, Journal of proteome research.

[21] P. Shannon, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. 2003, Genome research.