T-RMSD: a fine-grained, structure-based classification method and its application to the functional characterization of TNF receptors.

This study addresses the relation between structural and functional similarity in proteins. We introduce a novel method named Tree based on Root Mean Square Deviation (T-RMSD), which uses variations in distance RMSD (dRMSD) to build fine-grained, structure-based classifications of proteins. The main improvement of T-RMSD over similar methods, such as Dali, is its capacity to produce the equivalent of a bootstrap value for each cluster node. We validated our approach on two domain families studied extensively for their role in many biological and pathological pathways: the small GTPase RAS superfamily and the cysteine-rich domains (CRDs) associated with the tumor necrosis factor receptor (TNFR) family. Our analysis showed that T-RMSD is able to automatically recover and refine existing classifications. In the case of the small GTPase ARF subfamily, T-RMSD can distinguish GTP-bound from GDP-bound states, while in the case of CRDs it can identify two new subgroups associated with well-defined functional features (ligand binding and formation of the ligand pre-assembly complex). We show how hidden Markov models (HMMs) can be built on these new groups and propose a methodology to use these models simultaneously for fine-grained functional genomic annotation without known 3D structures. T-RMSD, open-source freeware incorporated in the T-Coffee package, is available online.
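
To make the dRMSD quantity concrete, the following minimal Python sketch computes the standard distance RMSD between two structures over a set of aligned positions: the root mean square difference between corresponding intramolecular distances. It is an illustration only, not the T-RMSD implementation shipped with T-Coffee; the function name `drmsd` and the plain coordinate-list input are assumptions made for this example, and the tree-building and support-value steps of the method are not shown.

```python
import math
from itertools import combinations


def drmsd(coords_a, coords_b):
    """Distance RMSD between two structures (hypothetical helper).

    coords_a, coords_b: equally long sequences of (x, y, z) tuples, where
    position k in one structure is aligned to position k in the other.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must have the same number of aligned positions")
    n = len(coords_a)
    if n < 2:
        raise ValueError("at least two aligned positions are required")

    total = 0.0
    # Compare every intramolecular distance d_ij in A with its counterpart in B.
    for i, j in combinations(range(n), 2):
        d_a = math.dist(coords_a[i], coords_a[j])
        d_b = math.dist(coords_b[i], coords_b[j])
        total += (d_a - d_b) ** 2

    n_pairs = n * (n - 1) // 2
    return math.sqrt(total / n_pairs)


# Toy coordinates (not real protein data).
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
shifted = [(x + 5.0, y + 5.0, z + 5.0) for x, y, z in reference]
scaled = [(2.0 * x, 2.0 * y, 2.0 * z) for x, y, z in reference]

print(drmsd(reference, shifted))  # a rigid translation leaves all distances unchanged
print(drmsd(reference, scaled))   # doubling every distance gives a non-zero score
```

Because the measure compares distances within each structure rather than atom positions, it requires no superposition of the two structures.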
