Analytical model of peptide mass cluster centres with applications

Background: The elemental composition of peptides results in the formation of distinct, equidistantly spaced clusters across the mass range. This peptide mass clustering is used to calibrate peptide mass lists, to identify and remove non-peptide peaks, and for data reduction.

Results: We developed an analytical model of the peptide mass cluster centres. The inputs to the model are the amino acid frequencies in the sequence database, the average length of the proteins in the database, the cleavage specificity of the proteolytic enzyme used, and the cleavage probability. We examined the accuracy of our model by comparing it with a model based on an in silico digest of the sequence database. To identify the crucial parameters, we analysed how the cluster centre location depends on the inputs. The distance to the nearest cluster centre was used to calibrate mass spectrometric peptide peak-lists and to identify non-peptide peaks.

Conclusion: The model introduced here enables us to predict the location of the peptide mass cluster centres. It explains how the location of the cluster centres depends on the input parameters. Fast and efficient calibration and filtering of non-peptide peaks is achieved using a distance measure suggested by Wool and Smilansky.
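The idea of a distance to the nearest cluster centre can be illustrated with a minimal Python sketch. It assumes that the cluster centres lie on a regular grid, `offset + k * spacing`. The spacing of 1.000495 Da, the zero offset, the 0.2 Da tolerance and all function names are illustrative assumptions; they are not values or an interface given in the abstract. The shift estimate is a deliberately simple stand-in (a constant offset taken as the median signed distance), not the calibration procedure of the paper.

```python
import statistics

# Assumed values, for illustration only (not stated in the abstract).
SPACING = 1.000495  # assumed distance between neighbouring cluster centres, in Da
OFFSET = 0.0        # assumed intercept of the cluster-centre grid, in Da


def distance_to_nearest_cluster(mass, spacing=SPACING, offset=OFFSET):
    """Signed distance (Da) from `mass` to the nearest centre offset + k * spacing."""
    k = round((mass - offset) / spacing)  # index of the nearest cluster centre
    return mass - (offset + k * spacing)


def is_peptide_like(mass, tolerance=0.2, spacing=SPACING, offset=OFFSET):
    """Flag a peak as peptide-like if it lies within `tolerance` Da of a centre."""
    return abs(distance_to_nearest_cluster(mass, spacing, offset)) <= tolerance


def estimate_shift(masses, spacing=SPACING, offset=OFFSET):
    """Estimate a constant mass shift as the median signed distance to the centres.

    Simplification: a real calibration may also need a mass-dependent term,
    and shifts close to half the spacing would be ambiguous.
    """
    return statistics.median(
        distance_to_nearest_cluster(m, spacing, offset) for m in masses
    )


if __name__ == "__main__":
    peaks = [800 * SPACING + 0.1, 1200 * SPACING + 0.1, 1500 * SPACING + 0.1]
    shift = estimate_shift(peaks)
    calibrated = [m - shift for m in peaks]
    print(shift, [is_peptide_like(m) for m in calibrated])
```

Under these assumptions, a peak that lies about half a spacing away from every centre is flagged as non-peptide, and a peak list with a small constant error is shifted back onto the grid by subtracting the estimated shift.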

[1] H. Lehrach et al. A calibration method that simplifies and improves accurate determination of peptide molecular masses by MALDI-TOF MS, 2002, Analytical chemistry.

[2] Knut Reinert et al. Calibration of mass spectrometric peptide mass fingerprint data without specific external or internal calibrants, 2005, BMC Bioinformatics.

[3] Fredrik Levander et al. Modular, scriptable and automated analysis tools for high-throughput peptide mass fingerprinting, 2004, Bioinform.

[4] D. Hochstrasser et al. Modeling peptide mass fingerprinting data using the atomic composition of peptides, 1999, Electrophoresis.

[5] Joachim Klose et al. Two‐dimensional electrophoresis of proteins: An updated protocol and implications for a functional analysis of the genome, 1995, Electrophoresis.

[6] K. Borzym et al. Complete genome sequence of the marine planctomycete Pirellula sp. strain 1, 2003, Proceedings of the National Academy of Sciences of the United States of America.

[7] Andrew Emili et al. In silico proteome analysis to facilitate proteomics experiments using mass spectrometry, 2003, Proteome Science.

[8] J. Yates et al. Similarity among tandem mass spectra from proteomic experiments: detection, significance, and utility, 2003, Analytical chemistry.

[9] Thorsteinn S. Rögnvaldsson et al. Automated methods for improved protein identification by peptide mass fingerprinting, 2004, Proteomics.

[10] P. Højrup et al. Use of mass spectrometric molecular weight information to identify proteins in sequence databases, 1993, Biological mass spectrometry.

[11] Tatiana A. Tatusova et al. NCBI Reference Sequence Project: update and current status, 2003, Nucleic Acids Res.

[12] R. Aebersold et al. Mass spectrometry-based proteomics, 2003, Nature.

[13] P. Højrup et al. Rapid identification of proteins by peptide-mass fingerprinting, 1993, Current Biology.

[14] S. Patterson. Data analysis—the Achilles heel of proteomics, 2003, Nature Biotechnology.

[15] Joachim Klose et al. Interpretation of mass spectrometry data for high-throughput proteomics, 2003, Analytical and bioanalytical chemistry.

[16] C. G. Edmonds et al. New developments in biochemical mass spectrometry: electrospray ionization, 1990, Analytical chemistry.

[17] R. Apweiler. Protein sequence databases, 2000, Advances in protein chemistry.

[18] R D Appel et al. Improving protein identification from peptide mass fingerprinting through a parameterized multi‐level scoring algorithm and an optimized peak detection, 1999, Electrophoresis.

[19] Brian D. Ripley et al. Modern Applied Statistics with S, Fourth edition, 2002.

[20] Assaf Wool et al. Precalibration of matrix‐assisted laser desorption/ionization‐time of flight spectra for peptide mass fingerprinting, 2002, Proteomics.

[21] R S Johnson et al. Novel fragmentation process of peptides by collision-induced decomposition in a tandem mass spectrometer: differentiation of leucine and isoleucine, 1987, Analytical chemistry.

[22] R. Becklin et al. Development of an LC-MALDI method for the analysis of protein complexes, 2004, Journal of the American Society for Mass Spectrometry.

[23] E. Nordhoff et al. Alpha-cyano-4-hydroxycinnamic acid affinity sample preparation. A protocol for MALDI-MS peptide analysis in proteomics, 2001, Analytical chemistry.

[24] T. Köcher et al. Preprocessing of tandem mass spectrometric data to support automatic protein identification, 2003, Proteomics.

[25] Cathy H. Wu et al. The Universal Protein Resource (UniProt), 2004, Nucleic Acids Res.

[26] Knut Reinert et al. Transformation and other factors of the peptide mass spectrometry pairwise peak-list comparison process, 2005, BMC Bioinformatics.

[27] D Fenyö et al. Identifying the proteome: software tools, 2000, Current opinion in biotechnology.

[28] Ruedi Aebersold et al. Advances in Proteome Analysis by Mass Spectrometry, 2001, The Journal of Biological Chemistry.

[29] D. N. Perkins et al. Probability‐based protein identification by searching sequence databases using mass spectrometry data, 1999, Electrophoresis.

[30] Frank Schmidt et al. Iterative data analysis is the key for exhaustive analysis of peptide mass fingerprints from proteins separated by two-dimensional electrophoresis, 2003, Journal of the American Society for Mass Spectrometry.

[31] J. Giles. Internet encyclopaedias go head to head, 2005, Nature.

[32] J. Chambers et al. The New S Language, 1989.