3D-Beacons: decreasing the gap between protein sequences and structures through a federated network of protein structure data resources

While scientists can often infer the biological function of proteins from their 3-dimensional quaternary structures, the gap between the number of known protein sequences and the number of experimentally determined structures keeps increasing. Ever more sophisticated computational protein modelling approaches offer a potential solution to this problem. Although often powerful on their own, most methods have their own strengths and weaknesses. Researchers therefore benefit from examining models from several model providers and comparing them to identify which models best address their specific use cases. To make data from a large array of model providers more easily accessible to the broader scientific community, we established 3D-Beacons, a collaborative initiative to create a federated network with unified data access mechanisms. The 3D-Beacons Network allows researchers to collate coordinate files and metadata for experimentally determined and theoretical protein models from state-of-the-art and specialist model providers, as well as from the Protein Data Bank.
