A synthetic peptide library for benchmarking crosslinking mass spectrometry search engines

We have created synthetic peptide libraries to benchmark crosslinking mass spectrometry search engines for different types of crosslinkers. The unique benefit of using a library is that it is known which identified crosslinks are true and which are false. Here we used mass spectrometry data generated from measurements of the peptide libraries to evaluate the most frequently applied search algorithms in crosslinking mass spectrometry. When filtered to an estimated false discovery rate of 5%, false crosslink identification ranged from 5.2% to 11.3% for search engines with inbuilt validation strategies for error estimation. When different external validation strategies were applied to a single search output, false crosslink identification ranged from 2.4% to a surprising 32%, despite filtering to an estimated false discovery rate of 5%. Remarkably, the use of MS-cleavable crosslinkers did not reduce the false discovery rate compared to non-cleavable crosslinkers, a result with far-reaching implications for structural biology. We anticipate that the datasets acquired during this research will further drive optimisation and development of search engines and novel data-interpretation technologies, thereby advancing our understanding of vital biological interactions.
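
The comparison between an estimated false discovery rate and the false identifications actually observed can be illustrated with a minimal sketch. It is not the evaluation code used in this work: the function names, the representation of a crosslink as an unordered pair of peptide labels, and the example labels are all assumptions made for illustration. The only idea taken from the text is that a library design tells us which crosslinks are possible, so any reported crosslink outside that set counts as false.

```python
from typing import Iterable, Set, Tuple

Crosslink = Tuple[str, str]


def normalise(pair: Crosslink) -> Crosslink:
    """Order the two peptide labels so that (A, B) and (B, A) compare equal."""
    a, b = pair
    return (a, b) if a <= b else (b, a)


def observed_false_rate(identified: Iterable[Crosslink],
                        true_links: Iterable[Crosslink]) -> float:
    """Fraction of unique identified crosslinks absent from the known-true set.

    Hypothetical helper: `true_links` stands in for the crosslinks that the
    library design allows; everything else is treated as a false identification.
    """
    truth: Set[Crosslink] = {normalise(p) for p in true_links}
    unique: Set[Crosslink] = {normalise(p) for p in identified}
    if not unique:
        return 0.0
    false_hits = sum(1 for p in unique if p not in truth)
    return false_hits / len(unique)


# Toy data with made-up peptide labels (not from the actual library).
true_links = [("pepA", "pepB"), ("pepC", "pepD"), ("pepE", "pepF")]
identified = [
    ("pepB", "pepA"),  # true, reported in reverse order
    ("pepA", "pepB"),  # duplicate of the line above after normalisation
    ("pepC", "pepD"),  # true
    ("pepE", "pepF"),  # true
    ("pepA", "pepD"),  # not possible in the library, so false
]

rate = observed_false_rate(identified, true_links)
print(f"observed false identification rate: {rate:.1%}")
```

In this toy case one of the four unique crosslinks is false, so the observed rate is 25%. If the result list had been filtered to an estimated false discovery rate of 5%, the gap between the two numbers is the kind of discrepancy the benchmark is meant to expose.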
