Protein complexes in cells by AI‐assisted structural proteomics

Accurately modeling the structures of proteins and their complexes using artificial intelligence is revolutionizing molecular biology. Experimental data enables a candidate-based approach to systematically model novel protein assemblies. Here, we use a combination of in-cell crosslinking mass spectrometry and cofractionation mass spectrometry (CoFrac-MS) to identify protein-protein interactions in the model Gram-positive bacterium Bacillus subtilis. We show that crosslinking prior to cell lysis reveals protein interactions that are often lost upon lysis. We predict the structures of these protein interactions and others in the SubtiWiki database with AlphaFold-Multimer and, after controlling for the false-positive rate of the predictions, we propose novel structural models of 153 dimeric and 14 trimeric protein assemblies. Crosslinking MS data independently validates the AlphaFold predictions and scoring. We report and validate novel interactors of central cellular machineries that include the ribosome, RNA polymerase and pyruvate dehydrogenase, assigning function to several uncharacterized proteins. Our approach uncovers protein-protein interactions inside intact cells, provides structural insight into their interaction interfaces, and is applicable to genetically intractable organisms, including pathogenic bacteria.
