Protein homologous cores and loops: important clues to evolutionary relationships between structurally similar proteins

Background

To discover remote evolutionary relationships and functional similarities between proteins, biologists rely on comparative sequence analysis and, when structures are available, on structural alignments and various measures of structural similarity. The measures most commonly used for this purpose include alignment length, percent sequence identity, superposition RMSD, and combinations of these. More recently, we have introduced the "Homologous core structure overlap score" (HCS) and the "Loop Hausdorff Measure" (LHM). Along with these we also consider the "gapped structural alignment score" (GSAS), which was introduced earlier by other researchers.

Results

We analyze the performance of these and other conventional measures at the task of ranking structure neighbors by homology, and we show that the HCS, LHM, and GSAS scores display considerably improved performance over the conventional measures of sequence or structural similarity.

Conclusion

The HCS, LHM, and GSAS scores are easily computable quantities that allow users of structure-neighbor databases to more easily identify interesting structural similarities between proteins.
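As a rough illustration of the conventional measures named in the abstract, the following minimal sketch computes alignment length, percent sequence identity, and RMSD for an already superposed set of paired coordinates. It also includes the classical symmetric Hausdorff distance between two point sets, shown only as general background: it is not the paper's LHM, whose exact definition is not given here. All function names and the gapped-string input format are assumptions made for this example.

```python
import math

GAP = "-"  # assumed gap symbol in the aligned sequence strings


def aligned_pairs(aligned_a, aligned_b):
    """Return residue pairs at positions where neither sequence has a gap."""
    return [(a, b) for a, b in zip(aligned_a, aligned_b) if a != GAP and b != GAP]


def alignment_length(aligned_a, aligned_b):
    """Number of aligned (non-gap) residue pairs."""
    return len(aligned_pairs(aligned_a, aligned_b))


def percent_identity(aligned_a, aligned_b):
    """Percentage of aligned residue pairs that are identical."""
    pairs = aligned_pairs(aligned_a, aligned_b)
    if not pairs:
        return 0.0
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation over paired 3D points.

    Assumes the two structures have already been superposed; no fitting
    is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


def hausdorff(points_a, points_b):
    """Classical symmetric Hausdorff distance between two point sets."""
    def directed(src, dst):
        # Largest distance from a point in src to its nearest point in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(points_a, points_b), directed(points_b, points_a))
```

In practice the aligned residue pairs and superposed coordinates would come from a structure-alignment program; the functions above only score an alignment that is already given.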
