Complex Portal 2022: new curation frontiers

Abstract: The Complex Portal (www.ebi.ac.uk/complexportal) is a manually curated, encyclopaedic database of macromolecular complexes with known function from a range of model organisms. It summarizes complex composition, topology and function, along with links to a large range of domain-specific resources (e.g. wwPDB, EMDB and Reactome). Since the last update in 2019, we have produced a first draft complexome for Escherichia coli, maintained and updated that of Saccharomyces cerevisiae, added over 40 coronavirus complexes and increased the human complexome to over 1100 complexes, including approximately 200 that act as targets for viral proteins or are part of the immune system. The display of protein features in ComplexViewer has been improved, and the participant table is now colour-coordinated with the nodes in ComplexViewer. Community collaboration has expanded, for example through contributing to an analysis of putative transcription cofactors and through providing data accessible to semantic web tools via Wikidata, which is now populated with manually curated Complex Portal content through a new bot. Our data license is now CC0 to encourage data reuse. Users are encouraged to get in touch, provide us with feedback and send curation requests through the ‘Support’ link.
