A mass spectrometry–based hybrid method for structural modeling of protein complexes

documented, combining information from all four MS-based approaches with modeling has not been reported to our knowledge. Here we describe a generic hybrid structural biology method that integrates orthogonal data sets for the same protein complex generated by native MS, label-free quantification (LFQ) by LC-MS/MS, IM-MS and CX-MS. This hybrid method differs from other approaches in its ability to generate orthogonal data sets from the same sample and to computationally integrate diverse MS data sets that differ in resolution and information content. Overall, the method enables accurate prediction of multiprotein and heterogeneous complexes when high-resolution information on the individual subunits is used, and it consists of experimental techniques that require only low-microgram sample amounts and that exhibit high measuring speed and tolerance for heterogeneous sample environments8. The method involves four steps: (i) protein purification and data collection by the respective MS technique (aliquots of the purified protein complex are first analyzed by LFQ and CX-MS and then, after buffer exchange, by IM-MS and native MS; Online Methods); (ii) encoding MS data into restraints; (iii) structure prediction by iterative sampling and scoring of models; and (iv) ensemble analysis to generate the most likely structures (Fig. 1a and Online Methods). We developed and benchmarked the method using three well-characterized complexes exhibiting distinct topologies: methane monooxygenase hydroxylase (MMOH) from Methylococcus capsulatus, toluene/o-xylene monooxygenase hydroxylase (ToMOH) from Pseudomonas stutzeri and urease from Klebsiella aerogenes (Online Methods, Supplementary Note 1 and Supplementary Fig. 1). Native MS allowed us to determine the stoichiometry of the complexes and their subunit connectivities5 (Supplementary Fig. 2).
IM-MS added orientationally averaged CCSs9, and CX-MS allowed us to identify high-confidence inter- and intraprotein interactions10–12. These MS-based restraints allowed us to sample models of the complexes. Next, we refined the models using an optimization step and ranked them with a weighted scoring function. We selected representative structures from the pool of highly ranked models by pairwise clustering of their α-carbon r.m.s. deviations (Cα RMSDs). A refinement step ensured physical interactions between subunits (Online Methods). For all complexes we found good agreement (RMSDs < 12 Å) between the best-scored models and their native structures (Fig. 1b,c and Supplementary Figs. 3–7). To evaluate the contribution of each restraint to predicting near-native structures, we carried out statistical tests using receiver operating characteristics (ROCs) (Supplementary Note 2).
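
As an illustration of how restraint scoring, ranking and Cα RMSD clustering can fit together, the following is a minimal Python sketch and not the implementation described in the Online Methods. All function names, the 30 Å cross-link distance bound, the 5% CCS tolerance, the restraint weights and the greedy clustering scheme are assumptions chosen for this example; the actual restraint forms, weights and clustering procedure may differ.

```python
import numpy as np

def crosslink_violation(coords, pairs, max_dist=30.0):
    """Sum of squared excess distances (in angstroms) over cross-linked pairs.

    coords: (N, 3) array of C-alpha positions; pairs: list of (i, j) indices.
    max_dist is an assumed upper bound, not a value taken from the text.
    """
    total = 0.0
    for i, j in pairs:
        d = np.linalg.norm(coords[i] - coords[j])
        total += max(0.0, d - max_dist) ** 2
    return total

def ccs_violation(model_ccs, measured_ccs, tolerance=0.05):
    """Relative CCS deviation beyond an assumed tolerated fraction."""
    rel = abs(model_ccs - measured_ccs) / measured_ccs
    return max(0.0, rel - tolerance)

def weighted_score(terms, weights):
    """Weighted sum of restraint violations; lower means better agreement."""
    return sum(weights[name] * value for name, value in terms.items())

def kabsch_rmsd(a, b):
    """C-alpha RMSD between two (N, 3) arrays after optimal superposition."""
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    u, _, vt = np.linalg.svd(a.T @ b)
    # Correct for a possible reflection so that the result is a proper rotation.
    d = np.sign(np.linalg.det(u @ vt))
    rot = u @ np.diag([1.0, 1.0, d]) @ vt
    diff = a @ rot - b
    return float(np.sqrt((diff ** 2).sum() / len(a)))

def cluster_models(models, cutoff):
    """Greedy leader clustering on pairwise RMSD (a simple stand-in).

    models must be ordered best score first, so the first member of each
    cluster serves as its representative structure.
    """
    clusters = []
    for idx, coords in enumerate(models):
        for cluster in clusters:
            if kabsch_rmsd(models[cluster[0]], coords) <= cutoff:
                cluster.append(idx)
                break
        else:
            clusters.append([idx])
    return clusters

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic coordinates stand in for sampled models of a complex.
    models = [rng.normal(scale=15.0, size=(50, 3)) for _ in range(5)]
    pairs = [(0, 10), (5, 40)]                 # hypothetical cross-linked residues
    weights = {"crosslink": 1.0, "ccs": 10.0}  # hypothetical restraint weights
    scores = [
        weighted_score(
            {"crosslink": crosslink_violation(m, pairs),
             "ccs": ccs_violation(6800.0, 7000.0)},  # placeholder CCS values
            weights,
        )
        for m in models
    ]
    order = np.argsort(scores)                 # rank models, best first
    ranked = [models[i] for i in order]
    print(cluster_models(ranked, cutoff=12.0))
```

Because the models are ranked before clustering, each cluster representative is the best-scored member of its cluster, which mirrors the idea of selecting representative structures from the pool of highly ranked models.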