The Structure–Function Linkage Database

The Structure–Function Linkage Database (SFLD, http://sfld.rbvi.ucsf.edu/) is a manually curated classification resource describing structure–function relationships for functionally diverse enzyme superfamilies. Members of such superfamilies are diverse in their overall reactions yet share a common ancestor and some conserved active site features associated with conserved functional attributes such as a partial reaction. Thus, despite their different functions, members of these superfamilies ‘look alike’, making them easy to misannotate. To address this complexity and enable rational transfer of functional features to unknowns only for those members for which we have sufficient functional information, we subdivide superfamily members into subgroups using sequence information, and lastly into families, sets of enzymes known to catalyze the same reaction using the same mechanistic strategy. Browsing and searching options in the SFLD provide access to all of these levels. The SFLD offers manually curated as well as automatically classified superfamily sets, both accompanied by search and download options for all hierarchical levels. Additional information includes multiple sequence alignments, tab-separated files of functional and other attributes, and sequence similarity networks. The latter provide a new and intuitively powerful way to visualize functional trends mapped to the context of sequence similarity.

[1]  Naryttza N. Diaz,et al.  The Subsystems Approach to Genome Annotation and its Use in the Project to Annotate 1000 Genomes , 2005, Nucleic acids research.

[2]  Patricia C. Babbitt,et al.  Pythoscape: a framework for generation of large protein similarity networks , 2012, Bioinform..

[3]  Nozomi Nagano,et al.  EzCatDB: the Enzyme Catalytic-mechanism Database , 2004, Nucleic Acids Res..

[4]  The UniProt Consortium,et al.  Update on activities at the Universal Protein Resource (UniProt) in 2013 , 2012, Nucleic Acids Res..

[5]  Narmada Thanki,et al.  CDD: conserved domains and protein three-dimensional structure , 2012, Nucleic Acids Res..

[6]  Robert D. Finn,et al.  InterPro in 2011: new developments in the family and domain prediction database , 2011, Nucleic acids research.

[7]  Gemma L. Holliday,et al.  MACiE: exploring the diversity of biochemical reactions , 2011, Nucleic Acids Res..

[8]  P. Babbitt,et al.  Toward mechanistic classification of enzyme functions. , 2011, Current opinion in chemical biology.

[9]  Antje Chang,et al.  BRENDA in 2013: integrated reactions, kinetic data, enzyme function data, improved disease classification: new options and contents in BRENDA , 2012, Nucleic Acids Res..

[10]  G. Hong,et al.  Nucleic Acids Research , 2015, Nucleic Acids Research.

[11]  Roman A. Laskowski,et al.  PDBsum new things , 2008, Nucleic Acids Res..

[12]  Patricia C. Babbitt,et al.  Structure-guided discovery of the metabolite carboxy-SAM that modulates tRNA function , 2013, Nature.

[13]  P. Shannon,et al.  Cytoscape: a software environment for integrated models of biomolecular interaction networks. , 2003, Genome research.

[14]  Anton J. Enright,et al.  BioLayout-an automatic graph layout algorithm for similarity visualization , 2001, Bioinform..

[15]  Susumu Goto,et al.  KEGG for integration and interpretation of large-scale molecular data sets , 2011, Nucleic Acids Res..

[16]  P. Babbitt,et al.  Enzyme (re)design: lessons from natural evolution and computation. , 2009, Current opinion in chemical biology.

[17]  Stephen K Burley PDB40: The Protein Data Bank celebrates its 40th birthday , 2013, Biopolymers.

[18]  E. Webb Enzyme nomenclature 1992. Recommendations of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology on the Nomenclature and Classification of Enzymes. , 1992 .

[19]  Conrad C. Huang,et al.  Leveraging enzyme structure-function relationships for functional inference and experimental design: the structure-function linkage database. , 2006, Biochemistry.

[20]  E. Myers,et al.  Basic local alignment search tool. , 1990, Journal of molecular biology.

[21]  Thomas E. Ferrin,et al.  Using Sequence Similarity Networks for Visualization of Relationships Across Diverse Protein Superfamilies , 2009, PloS one.

[22]  Ni Li,et al.  Gene Ontology Annotations and Resources , 2012, Nucleic Acids Res..

[23]  Shoshana D. Brown,et al.  Homology models guide discovery of diverse enzyme specificities among dipeptide epimerases in the enolase superfamily , 2012, Proceedings of the National Academy of Sciences.

[24]  Janet M. Thornton,et al.  The Catalytic Site Atlas: a resource of catalytic sites and residues identified in enzymes using structural data , 2004, Nucleic Acids Res..

[25]  Andrew G. McDonald,et al.  ExplorEnz: the primary source of the IUBMB enzyme list , 2008, Nucleic Acids Res..

[26]  C. Chothia,et al.  Assignment of homology to genome sequences using a library of hidden Markov models that represent all proteins of known structure. , 2001, Journal of molecular biology.

[27]  David A. Lee,et al.  New functional families (FunFams) in CATH to improve the mapping of conserved functional sites to 3D structures , 2012, Nucleic Acids Res..

[28]  L. L. Lloyd,et al.  Enzyme nomenclature — Recommendations of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology: Academic Press Ltd, London, UK, 1992. xiii + 862 pp. Price £40.00. ISBN 0-12-227165-3 , 1994 .

[29]  Ian Sillitoe,et al.  FunTree: a resource for exploring the functional evolution of structurally defined enzyme superfamilies , 2011, Nucleic Acids Res..

[30]  E. Birney,et al.  Pfam: the protein families database , 2013, Nucleic Acids Res..

[31]  María Martín,et al.  Activities at the Universal Protein Resource (UniProt) , 2013, Nucleic Acids Res..

[32]  Inna Dubchak,et al.  MicrobesOnline: an integrated portal for comparative and functional genomics , 2009, Nucleic Acids Res..

[33]  P. Babbitt,et al.  Divergent evolution of enzymatic function: mechanistically diverse superfamilies and functionally distinct suprafamilies. , 2001, Annual review of biochemistry.

[34]  Owen White,et al.  The TIGRFAMs database of protein families , 2003, Nucleic Acids Res..

[35]  Ian Sillitoe,et al.  Gene3D: a domain-based resource for comparative genomics, functional annotation and protein network analysis , 2011, Nucleic Acids Res..

[36]  Patricia C. Babbitt,et al.  Annotation Error in Public Databases: Misannotation of Molecular Function in Enzyme Superfamilies , 2009, PLoS Comput. Biol..

[37]  Shoshana D. Brown,et al.  A gold standard set of mechanistically diverse enzyme superfamilies , 2006, Genome Biology.

[38]  Shoshana D. Brown,et al.  Inference of Functional Properties from Large-scale Analysis of Enzyme Superfamilies* , 2011, The Journal of Biological Chemistry.

[39]  Daniel W. A. Buchan,et al.  A large-scale evaluation of computational protein function prediction , 2013, Nature Methods.

[40]  Cathy H. Wu,et al.  PIRSF Family Classification System for Protein Functional and Evolutionary Analysis , 2006, Evolutionary bioinformatics online.

[41]  Rolf Apweiler,et al.  CluSTr: a database of clusters of SWISS-PROT+TrEMBL proteins , 2001, Nucleic Acids Res..

[42]  Tim J. P. Hubbard,et al.  Data growth and its impact on the SCOP database: new developments , 2007, Nucleic Acids Res..

[43]  S. Balaji,et al.  SUPFAM: A database of sequence superfamilies of protein domains , 2004, BMC Bioinformatics.

[44]  Patricia C Babbitt,et al.  The evolution of function in strictosidine synthase‐like proteins , 2011, Proteins.

[45]  Conrad C. Huang,et al.  Representing Structure-Function Relationships in Mechanistically Diverse Enzyme Superfamilies , 2004, Pacific Symposium on Biocomputing.

[46]  Patricia C. Babbitt,et al.  Prediction of function for the polyprenyl transferase subgroup in the isoprenoid synthase superfamily , 2013, Proceedings of the National Academy of Sciences.

[47]  Conrad C. Huang,et al.  UCSF Chimera—A visualization system for exploratory research and analysis , 2004, J. Comput. Chem..

[48]  Robert D. Finn,et al.  HMMER web server: interactive sequence similarity searching , 2011, Nucleic Acids Res..

[49]  Haruki Nakamura,et al.  PDB 40 : The Protein Data Bank Celebrates its 40 th Birthday , .