CAFASP2: The second critical assessment of fully automated structure prediction methods

The results of the second Critical Assessment of Fully Automated Structure Prediction (CAFASP2) are presented. The goals of CAFASP are to (i) assess the performance of fully automatic web servers for structure prediction, by using the same blind prediction targets as those used at CASP4, (ii) inform the community of users about the capabilities of the servers, (iii) allow human groups participating in CASP to use and analyze the results of the servers while preparing their nonautomated predictions for CASP, and (iv) compare the performance of the automated servers to that of the human‐expert groups of CASP. More than 30 servers from around the world participated in CAFASP2, covering all categories of structure prediction. The category with the largest participation was fold recognition, where 24 CAFASP servers filed predictions along with 103 other CASP human groups. The CAFASP evaluation indicated that it is difficult to establish an exact ranking of the servers because the number of prediction targets was relatively small and the differences among many servers were also small. However, a group of roughly five “best” fold recognition servers could be identified. The CASP evaluation identified the same group of top servers, albeit in a slightly different relative order. Both evaluations ranked a semiautomated method named CAFASP‐CONSENSUS, which filed predictions using the CAFASP results of the servers, above any of the individual servers. Although the predictions of the CAFASP servers were available to human CASP predictors before the CASP submission deadline, the CASP assessment identified only 11 human groups that performed better than the best server. Furthermore, about one fourth of the top 30 performing groups corresponded to automated servers. At least half of the top 11 groups corresponded to human groups that also had a server in CAFASP or to human groups that used the CAFASP results to prepare their predictions. In particular, the CAFASP‐CONSENSUS group was ranked seventh. This shows that the automated predictions of the servers can be very helpful to human predictors. We conclude that as servers continue to improve, they will become increasingly important in any prediction process, especially when dealing with genome‐scale prediction tasks. We expect that in the near future, the performance difference between humans and machines will continue to narrow and that fully automated structure prediction will become an effective companion and complement to experimental structural genomics. Proteins 2001;Suppl 5:171–183. © 2002 Wiley‐Liss, Inc.
