Using scores derived from statistical coupling analysis to distinguish correct and incorrect folds in de‐novo protein structure prediction

Distinguishing native from non‐native folds remains a challenging problem for protein structure prediction. We describe a method, SCA‐distance scoring, that is based on results from statistical coupling analysis and that discriminates between native and non‐native folds produced by a de novo protein structure prediction method for four out of five test proteins. The method is particularly good at identifying non‐native folds that are close in RMSD to the true fold but contain a change in an internal structural element. SCA‐distance scoring is a useful addition to the tools available for distinguishing native from non‐native folds in protein structure prediction. Proteins 2008. © 2007 Wiley‐Liss, Inc.
