OpenPepXL: An Open-Source Tool for Sensitive Identification of Cross-Linked Peptides in XL-MS

XL-MS has been recognized as an effective source of information about protein structures and interactions. OpenPepXL is a sensitive XL-MS identification software that reports from 7% to 40% more structurally validated cross-links than other tools on data sets with available high-resolution structures for cross-link validation. It is open source and has been built as part of the OpenMS suite of tools. OpenPepXL supports all common operating systems and open data formats.

Highlights: OpenPepXL is a new XL-MS identification tool with high sensitivity. It is available for all common operating systems and remote computing environments. OpenPepXL is open source and supports open data formats like mzML and mzIdentML. OpenPepXL is available as part of OpenMS at https://www.openms.de/openpepxl.

Cross-linking MS (XL-MS) has been recognized as an effective source of information about protein structures and interactions. In contrast to regular peptide identification, XL-MS has to deal with a quadratic search space, where peptides from every protein could potentially be cross-linked to peptides from any other protein. To cope with this search space, most tools apply different heuristics for search space reduction. We introduce a new open-source XL-MS database search algorithm, OpenPepXL, which offers increased sensitivity compared with other tools. OpenPepXL searches the full search space of an XL-MS experiment without using heuristics to reduce it. Because of efficient data structures and built-in parallelization, OpenPepXL achieves excellent runtimes and can also be deployed on large compute clusters and cloud services while maintaining a slim memory footprint. We compared OpenPepXL to several other commonly used tools for identification of noncleavable labeled and label-free cross-linkers on a diverse set of XL-MS experiments. In our first comparison, we used a data set from a fraction of a cell lysate with a protein database of 128 targets and 128 decoys.
At 5% FDR, OpenPepXL finds from 7% to over 50% more unique residue pairs (URPs) than other tools. On data sets with available high-resolution structures for cross-link validation, OpenPepXL reports from 7% to over 40% more structurally validated URPs than other tools. Additionally, we used a synthetic peptide data set that allows objective validation of cross-links without relying on structural information and found that OpenPepXL reports at least 12% more validated URPs than other tools. OpenPepXL has been built as part of the OpenMS suite of tools and supports the Windows, macOS, and Linux operating systems. It also supports the mzIdentML 1.2 format for XL-MS identification results. It is freely available under a three-clause BSD license at https://openms.org/openpepxl.
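
To make the "quadratic search space" mentioned above concrete, the following minimal sketch counts candidates for a linear peptide search versus a cross-link search over the same digest. It is an illustration only and not OpenPepXL's implementation: the function names and the toy peptide strings are hypothetical, and the count ignores real-world constraints such as precursor mass tolerance and which residues a cross-linker can react with.

```python
from itertools import combinations_with_replacement


def count_linear_candidates(n_peptides: int) -> int:
    """A linear search considers each peptide once."""
    return n_peptides


def count_pair_candidates(n_peptides: int) -> int:
    """Unordered peptide pairs, including a peptide paired with a copy of itself.

    This is n * (n + 1) / 2, which grows quadratically with n.
    """
    return n_peptides * (n_peptides + 1) // 2


def enumerate_pairs(peptides):
    """Explicitly list all unordered pairs (only feasible for small inputs)."""
    return list(combinations_with_replacement(peptides, 2))


if __name__ == "__main__":
    # Hypothetical toy digest; real digests contain many thousands of peptides.
    toy_peptides = ["PEPTIDEK", "ANOTHERK", "THIRDPEPK"]
    print(enumerate_pairs(toy_peptides))
    for n in (10, 1_000, 100_000):
        print(n, count_linear_candidates(n), count_pair_candidates(n))
```

The closed-form count and the explicit enumeration agree for small inputs, which shows why exhaustively searching all pairs demands efficient data structures and parallelization as the database grows.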
