Generalized comparative modeling (GENECOMP): A combination of sequence comparison, threading, and lattice modeling for protein structure prediction and refinement

An improved generalized comparative modeling method, GENECOMP, for the refinement of threading models is developed and validated on the Fischer database of 68 probe–template pairs, a standard benchmark used to evaluate threading approaches. The basic idea is to perform ab initio folding with a lattice protein model, SICHO, in the vicinity of the template provided by the new threading algorithm PROSPECTOR. PROSPECTOR also provides predicted contacts and secondary structure for the template‐aligned regions and, where additional information can be gathered from other top‐scoring threaded structures, for the unaligned regions as well. Because the lowest‐energy structure generated by the simulations is not necessarily the best structure, we employed two structure‐selection protocols: distance geometry and clustering. Of the two, clustering generally yields somewhat better‐quality structures, doing so in 38 of the 68 cases. When applied to the Fischer database, the protocol does no harm and in a significant number of cases improves upon the initial threading model, sometimes dramatically. The procedure is readily automated and can be implemented on a genomic scale. Proteins 2001;44:133–149. © 2001 Wiley‐Liss, Inc.
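The clustering-based selection step can be illustrated with a minimal sketch. The abstract does not specify the similarity measure or clustering algorithm used, so everything here is an assumption made for illustration: structures are compared by distance RMSD (dRMSD) over their coordinates, which needs no superposition, and the representative is taken to be the structure with the most neighbors within a cutoff. The names `drmsd`, `largest_cluster_center`, and `cutoff` are hypothetical and are not taken from GENECOMP, SICHO, or PROSPECTOR.

```python
import math
from itertools import combinations


def drmsd(a, b):
    """Distance RMSD between two conformations given as lists of (x, y, z).

    Compares all intra-structure pairwise distances, so no superposition
    is required. Both conformations must have the same number of points.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("conformations must have equal length >= 2")
    pairs = list(combinations(range(len(a)), 2))
    total = 0.0
    for i, j in pairs:
        diff = math.dist(a[i], a[j]) - math.dist(b[i], b[j])
        total += diff * diff
    return math.sqrt(total / len(pairs))


def largest_cluster_center(structures, cutoff):
    """Return (center_index, member_indices) of the most populated cluster.

    Assumption for this sketch: a structure's cluster is the set of
    structures within `cutoff` dRMSD of it, and the center is the
    structure with the most such neighbors (ties go to the lowest index).
    """
    n = len(structures)
    neighbors = [[i] for i in range(n)]  # each structure neighbors itself
    for i, j in combinations(range(n), 2):
        if drmsd(structures[i], structures[j]) <= cutoff:
            neighbors[i].append(j)
            neighbors[j].append(i)
    center = max(range(n), key=lambda i: len(neighbors[i]))
    return center, sorted(neighbors[center])


if __name__ == "__main__":
    # Toy data: three similar extended chains and one compact outlier.
    extended = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
    variant_a = extended[:3] + [(11.4, 0.5, 0.0)]
    variant_b = extended[:3] + [(11.4, 0.0, 0.5)]
    compact = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]
    center, members = largest_cluster_center(
        [extended, variant_a, variant_b, compact], cutoff=1.0
    )
    print(center, members)
```

Selecting the center of the most populated cluster, rather than the lowest-energy structure, reflects the observation above that the lowest-energy structure is not necessarily the best one. The cutoff value is a free parameter in this sketch and would need tuning for real simulation output.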
