An algorithm for the identification of proteins using peptides with ragged N‐ or C‐termini generated by sequential endo‐ and exopeptidase digestions

We have developed an algorithm, MassDynSearch, for identifying proteins from peptide masses combined with short associated sequences (tags). In the ‘Tag searching’ approach developed by Matthias Mann, sequence tags are generated by gas-phase fragmentation of peptides in a mass spectrometer; in ‘Rag Tag’ searching, by contrast, the peptide tags are generated enzymatically or chemically. The protein is digested either chemically or with an endopeptidase, and the resulting mixture is then subjected to partial exopeptidase degradation, which produces peptides with ragged N- or C-termini. The mixture is analyzed by matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry to generate a list of intact peptide masses, each associated with a set of degradation product masses that serve as a unique tag. These ‘tagged masses’ are the input to MassDynSearch, which searches protein and DNA databases for proteins that contain similar tagged motifs. The method is simple and rapid, and can be fully automated. Its main advantage is that the specificity of the initial digestion is unimportant, since multiple tagged peptides are used to search the database. This is especially useful for membrane, cytoskeletal, and other proteins for which specific endopeptidases are less efficient and lower-specificity proteases such as chymotrypsin, pepsin, and elastase must be used.
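The idea of searching with a ‘tagged mass’ can be illustrated with a minimal Python sketch. This is not the MassDynSearch implementation, whose internals are not described here; it is an assumed, simplified reading of the approach. The function names (`peptide_mass`, `ladder_to_tag`, `find_tagged_peptides`), the 0.3 Da tolerance, and the example sequence are all hypothetical. The sketch converts the mass differences in a degradation ladder into candidate residues, then scans every substring of a protein sequence (so no cleavage specificity is assumed) for one whose mass matches the intact peptide and whose ragged terminus matches the tag.

```python
# Standard monoisotopic amino acid residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added to the residue sum of a free peptide


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[a] for a in seq) + WATER


def ladder_to_tag(parent_mass, product_masses, tol=0.3):
    """Convert a degradation ladder into a tag.

    Returns a list of sets; entry i holds the residues whose mass matches
    the i-th removal step (ambiguities such as L/I or K/Q are kept).
    The tolerance `tol` is an illustrative assumption.
    """
    ladder = sorted([parent_mass, *product_masses], reverse=True)
    tag = []
    for heavier, lighter in zip(ladder, ladder[1:]):
        delta = heavier - lighter
        candidates = {a for a, m in RESIDUE_MASS.items() if abs(m - delta) <= tol}
        if not candidates:
            raise ValueError(f"no residue matches a mass step of {delta:.3f} Da")
        tag.append(candidates)
    return tag


def find_tagged_peptides(protein, parent_mass, tag, terminus="C", tol=0.3):
    """Find substrings matching both the intact mass and the terminal tag.

    terminus="C" models carboxypeptidase-type degradation (residues removed
    from the C-terminus); terminus="N" models aminopeptidase-type degradation.
    """
    hits = []
    for start in range(len(protein)):
        mass = WATER
        for end in range(start, len(protein)):
            mass += RESIDUE_MASS[protein[end]]
            if mass > parent_mass + tol:
                break  # extending the peptide further only makes it heavier
            if abs(mass - parent_mass) <= tol:
                pep = protein[start:end + 1]
                # Residues in the order the exopeptidase would remove them.
                removed = pep[::-1] if terminus == "C" else pep
                if len(pep) >= len(tag) and all(
                    removed[i] in tag[i] for i in range(len(tag))
                ):
                    hits.append((start, pep))
    return hits


if __name__ == "__main__":
    protein = "MKTAYIAKQRQISFVKSHFSRQ"  # made-up sequence for illustration
    parent = peptide_mass("AYIAKQR")
    products = [peptide_mass(s) for s in ("AYIAKQ", "AYIAK", "AYIA")]
    tag = ladder_to_tag(parent, products)
    print(tag)
    print(find_tagged_peptides(protein, parent, tag))
```

A real search would repeat this over every database entry and combine the evidence from several tagged peptides, which is what makes the specificity of the initial digestion unimportant. Scoring, mass calibration, and translation of DNA entries are omitted from the sketch.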
