ModLS: Post-translational modification localization scoring with automatic specificity expansion

Probability-based localisation scoring of phosphorylation sites identified from fragment mass spectra has become common practice, both to confirm search engine modification assignments and to indicate the degree of certainty with which they are defined. Localisation of modifications other than phosphorylation is also required, but is less commonly supported by current tools. These other modifications, such as hydroxylation, may have broad amino-acid specificity, and can be misassigned when the correct specificity is not considered in an MS database search. In addition, localisation software is often specific to a particular MS/MS search engine, and cannot be used to localise modifications identified by multiple search engines. ModLS, a new tool within our freely available Central Proteomics Facilities Pipeline (CPFP), applies a localisation scoring method to arbitrary post-translational modifications (PTMs). As well as localising PTMs based on the amino-acid specificities included in the initial search, ModLS can automatically consider additional specificities from UniMod. This can help avoid 'correct modification, incorrect amino-acid' errors, which can occur when data are searched using only a subset of PTM specificities. Localisation scoring can be performed on the results from any search engine incorporated within the pipeline, or on combined results where the output of individual search engines is merged to give increased coverage. We demonstrate the performance of ModLS using a publicly available phosphorylated peptide dataset, showing that it outperforms the recently characterised Mascot Delta Score approach for CID and MSA data, and is comparable for HCD data. In addition, we show the utility of automatic specificity expansion using hydroxylated and methylated peptide data. ModLS is a user-friendly localisation tool for arbitrary modifications. Its inclusion within CPFP allows PTM localisation to be performed quickly and easily on large or small result sets, from multiple search engines. Specificity expansion, introduced in ModLS, allows misassignments of modifications due to incomplete consideration of specificities to be identified and minimised.
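The idea of specificity expansion can be illustrated with a minimal Python sketch. It is not the ModLS implementation, which is not described in this abstract. The specificity tables, the peptide, and the function name `candidate_placements` are hypothetical and chosen only for illustration; a real run would read specificities from UniMod. The sketch enumerates the candidate site placements a localisation scorer would have to compare, first with only the searched specificity and then with an expanded set.

```python
from itertools import combinations

# Illustrative specificity tables only (assumption): not taken from UniMod.
SEARCHED = {"Hydroxylation": {"P"}}                 # residues allowed in the initial search
EXPANDED = {"Hydroxylation": {"P", "K", "D", "N"}}  # residues allowed after expansion


def candidate_placements(peptide, mod, n_mods, specificities):
    """Return every way to place n_mods copies of mod on allowed residues.

    Positions are 1-based indices into the peptide sequence.
    """
    allowed = specificities[mod]
    sites = [i + 1 for i, aa in enumerate(peptide) if aa in allowed]
    return list(combinations(sites, n_mods))


peptide = "LNDPAK"  # hypothetical peptide carrying one hydroxylation
searched = candidate_placements(peptide, "Hydroxylation", 1, SEARCHED)
expanded = candidate_placements(peptide, "Hydroxylation", 1, EXPANDED)

print("searched specificities:", searched)
print("expanded specificities:", expanded)
```

With only the searched specificity, the proline is the sole candidate site, so any hydroxylation on this peptide would be assigned there by default. After expansion, the asparagine, aspartate and lysine become competing candidates, and a localisation score computed over all of them can flag a 'correct modification, incorrect amino-acid' assignment.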
