Protein function annotation by homology-based inference

With many genomes now sequenced, computational annotation methods to characterize genes and proteins from their sequence are increasingly important. The BioSapiens Network has developed tools to address all stages of this process, and here we review progress in the automated prediction of protein function based on protein sequence and structure.

[1]  A. Lesk,et al.  The relation between the divergence of sequence and structure in proteins. , 1986, The EMBO journal.

[2]  W R Taylor,et al.  Protein structure alignment. , 1989, Journal of molecular biology.

[3]  E. Myers,et al.  Basic local alignment search tool. , 1990, Journal of molecular biology.

[4]  C. Sander,et al.  Protein structure comparison by alignment of distance matrices. , 1993, Journal of molecular biology.

[5]  S. Bryant,et al.  Threading a database of protein cores , 1995, Proteins.

[6]  A G Murzin,et al.  SCOP: a structural classification of proteins database for the investigation of sequences and structures. , 1995, Journal of molecular biology.

[7]  K Fidelis,et al.  A large‐scale experiment to assess protein structure prediction methods , 1995, Proteins.

[8]  R. Laskowski SURFNET: a program for visualizing molecular surfaces, cavities, and intermolecular interactions. , 1995, Journal of molecular graphics.

[9]  R. Altman,et al.  Characterizing the microenvironment surrounding protein sites , 1995, Protein science : a publication of the Protein Society.

[10]  M. Swindells,et al.  Protein clefts in molecular recognition and function. , 1996, Protein science : a publication of the Protein Society.

[11]  F. Cohen,et al.  An evolutionary trace method defines binding surfaces common to protein families. , 1996, Journal of molecular biology.

[12]  Thomas L. Madden,et al.  Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. , 1997, Nucleic acids research.

[13]  David C. Jones,et al.  CATH--a hierarchic classification of protein domain structures. , 1997, Structure.

[14]  angesichts der Corona-Pandemie,et al.  UPDATE , 1973, The Lancet.

[15]  P E Bourne,et al.  Protein structure alignment by incremental combinatorial extension (CE) of the optimal path. , 1998, Protein engineering.

[16]  G J Kleywegt,et al.  Recognition of spatial motifs in protein structures. , 1999, Journal of molecular biology.

[17]  T. N. Bhat,et al.  The Protein Data Bank , 2000, Nucleic Acids Res..

[18]  M. Ashburner,et al.  Gene Ontology: tool for the unification of biology , 2000, Nature Genetics.

[19]  S. Teichmann,et al.  Domain combinations in archaeal, eubacterial and eukaryotic proteomes. , 2001, Journal of molecular biology.

[20]  Ioannis Xenarios,et al.  DIP: The Database of Interacting Proteins: 2001 update , 2001, Nucleic Acids Res..

[21]  B. Snel,et al.  Comparative assessment of large-scale data sets of protein–protein interactions , 2002, Nature.

[22]  Stacy T. Knutson,et al.  Prediction of deleterious functional effects of amino acid mutations using a library of structure‐based function descriptors , 2003, Proteins.

[23]  Robert B. Russell,et al.  Annotation in three dimensions , 2003 .

[24]  Darren A. Natale,et al.  The COG database: an updated version includes eukaryotes , 2003, BMC Bioinformatics.

[25]  Adam Godzik,et al.  Flexible structure alignment by chaining aligned fragment pairs allowing twists , 2003, ECCB.

[26]  Robert B. Russell,et al.  Annotation in three dimensions. PINTS: Patterns in Non-homologous Tertiary Structures , 2003, Nucleic Acids Res..

[27]  Ashish V. Tendulkar,et al.  Functional sites in protein families uncovered via an objective and automated graph theoretic approach. , 2003, Journal of molecular biology.

[28]  Adam J. Smith,et al.  The Database of Interacting Proteins: 2004 update , 2004, Nucleic Acids Res..

[29]  Martin Vingron,et al.  Large scale hierarchical clustering of protein sequences , 2005, BMC Bioinformatics.

[30]  Vladimir A. Ivanisenko,et al.  PDBSiteScan: a program for searching for active, binding and posttranslational modification sites in the 3D structures of proteins , 2004, Nucleic Acids Res..

[31]  D. Frishman,et al.  A domain interaction map based on phylogenetic profiling. , 2004, Journal of molecular biology.

[32]  Janet M. Thornton,et al.  The Catalytic Site Atlas: a resource of catalytic sites and residues identified in enzymes using structural data , 2004, Nucleic Acids Res..

[33]  K Henrick,et al.  Electronic Reprint Biological Crystallography Secondary-structure Matching (ssm), a New Tool for Fast Protein Structure Alignment in Three Dimensions Biological Crystallography Secondary-structure Matching (ssm), a New Tool for Fast Protein Structure Alignment in Three Dimensions , 2022 .

[34]  Patrick Aloy,et al.  Ten thousand interactions for the molecular biologist , 2004, Nature Biotechnology.

[35]  Robert S. Ledley,et al.  PIRSF: family classification system at the Protein Information Resource , 2004, Nucleic Acids Res..

[36]  Jie Liang,et al.  pvSOAR: detecting similar surface patterns of pocket and void surfaces of amino acid residues on proteins , 2004, Nucleic Acids Res..

[37]  Cathy H. Wu,et al.  The Universal Protein Resource (UniProt) , 2004, Nucleic Acids Res..

[38]  Sébastien Carrère,et al.  The ProDom database of protein domain families: more emphasis on 3D , 2004, Nucleic Acids Res..

[39]  Christopher J. Lee,et al.  Inferring protein domain interactions from databases of interacting proteins , 2005, Genome Biology.

[40]  Robert B. Russell,et al.  3did: interacting protein domains of known three-dimensional structure , 2004, Nucleic Acids Res..

[41]  Robert D. Finn,et al.  iPfam: visualization of protein?Cprotein interactions in PDB at domain and amino acid resolutions , 2005, Bioinform..

[42]  Anna Tramontano,et al.  The prediction of protein function at CASP6 , 2005, Proteins.

[43]  Gail J. Bartlett,et al.  Effective function annotation through catalytic residue conservation. , 2005, Proceedings of the National Academy of Sciences of the United States of America.

[44]  Rachel Kolodny,et al.  Comprehensive evaluation of protein structure alignment methods: scoring by geometric measures. , 2005, Journal of molecular biology.

[45]  Janet M. Thornton,et al.  ProFunc: a server for predicting protein function from 3D structure , 2005, Nucleic Acids Res..

[46]  Ruth Nussinov,et al.  SiteEngines: recognition and comparison of binding sites and protein–protein interfaces , 2005, Nucleic Acids Res..

[47]  D. Eisenberg,et al.  Inference of protein function from protein structure. , 2005, Structure.

[48]  Vladimir A. Ivanisenko,et al.  PDBSite: a database of the 3D structure of protein functional sites , 2004, Nucleic Acids Res..

[49]  Itay Mayrose,et al.  ConSurf 2005: the projection of evolutionary conservation scores of residues on protein structures , 2005, Nucleic Acids Res..

[50]  Robert Petryszak,et al.  The predictive power of the CluSTr database , 2005, Bioinform..

[51]  Dmitrij Frishman,et al.  The MIPS mammalian protein?Cprotein interaction database , 2005, Bioinform..

[52]  N. Ben-Tal,et al.  The ConSurf‐HSSP database: The mapping of evolutionary conservation among homologs onto PDB structures , 2004, Proteins.

[53]  Ori Sasson,et al.  ProtoNet 4.0: A hierarchical classification of one million protein sequences , 2004, Nucleic Acids Res..

[54]  Anna Tramontano,et al.  Revisiting the prediction of protein function at CASP6 , 2006, The FEBS journal.

[55]  Patricia C. Babbitt,et al.  Automated discovery of 3D motifs for protein function annotation , 2006, Bioinform..

[56]  Adam Godzik,et al.  JAFA: a protein function annotation meta-server , 2006, Nucleic Acids Res..

[57]  Peer Bork,et al.  SMART 5: domains in the context of genomes and networks , 2005, Nucleic Acids Res..

[58]  Raja Jothi,et al.  Co-evolutionary analysis of domains in interacting proteins reveals insights into domain-domain interactions mediating protein-protein interactions. , 2006, Journal of molecular biology.

[59]  Anna Tramontano,et al.  The role of molecular modelling in biomedical research , 2006, FEBS letters.

[60]  Arun K. Ramani,et al.  How complete are current yeast and human protein-interaction networks? , 2006, Genome Biology.

[61]  Janet M Thornton,et al.  Integrating biological data through the genome. , 2006, Human molecular genetics.

[62]  Jie Liang,et al.  CASTp: computed atlas of surface topography of proteins with structural and topographical mapping of functionally annotated residues , 2006, Nucleic Acids Res..

[63]  Frances M. G. Pearl,et al.  CATHEDRAL: A Fast and Effective Algorithm to Predict Folds and Domain Boundaries from Multidomain Protein Structures , 2007, PLoS Comput. Biol..

[64]  Janet M Thornton,et al.  Towards fully automated structure-based function prediction in structural genomics: a case study. , 2007, Journal of molecular biology.

[65]  Janusz M Bujnicki,et al.  SURF’s UP! — Protein classification by surface comparisons , 2007, Journal of Biosciences.

[66]  Robert D. Finn,et al.  Integrating sequence and structural biology with DAS , 2007, BMC Bioinformatics.

[67]  Krzysztof Fidelis,et al.  Progress from CASP6 to CASP7 , 2007, Proteins.

[68]  David Kim,et al.  Assessment of predictions submitted for the CASP7 domain prediction category , 2007, Proteins.

[69]  Cyrus Chothia,et al.  The SUPERFAMILY database in 2007: families and functions , 2006, Nucleic Acids Res..

[70]  Lydia E. Kavraki,et al.  Prediction of enzyme function based on 3D templates of evolutionarily important amino acids , 2008, BMC Bioinformatics.

[71]  Christine A. Orengo,et al.  Methods of remote homology detection can be combined to increase coverage by 10% in the midnight zone , 2007, Bioinform..

[72]  Nathan Linial,et al.  EVEREST: a collection of evolutionary conserved protein domains , 2006, Nucleic Acids Res..

[73]  Narmada Thanki,et al.  CDD: a conserved domain database for interactive domain family analysis , 2006, Nucleic Acids Res..

[74]  Zbyszek Otwinowski,et al.  Structural genomics: keeping up with expanding knowledge of the protein universe. , 2007, Current opinion in structural biology.

[75]  Robert D. Finn,et al.  New developments in the InterPro database , 2007, Nucleic Acids Res..

[76]  Andreas Martin Lisewski,et al.  De-Orphaning the Structural Proteome through Reciprocal Comparison of Evolutionarily Important Structural Features , 2008, PloS one.

[77]  Nigel J. Martin,et al.  Gene3D: comprehensive structural and functional annotation of genomes , 2007, Nucleic Acids Res..

[78]  Dmitrij Frishman,et al.  DIMA 2.0—predicted and known domain interactions , 2008, Nucleic Acids Res..

[79]  Teresa M. Przytycka,et al.  DOMINE: a database of protein domain interactions , 2007, Nucleic Acids Res..

[80]  E. Birney,et al.  Pfam: the protein families database , 2013, Nucleic Acids Res..

[81]  Christian J. A. Sigrist,et al.  Nucleic Acids Research Advance Access published November 14, 2007 The 20 years of PROSITE , 2007 .

[82]  L. Holm,et al.  The Pfam protein families database , 2005, Nucleic Acids Res..

[83]  J. Pickleman,et al.  M Me Et Th Ho Od Ds S , 2022 .