Revisiting the prediction of protein function at CASP6

The ability to predict the function of a protein from its sequence and/or 3D structure is essential for exploiting the wealth of data made available by genomics and structural genomics projects, and it is therefore attracting increasing interest in the computational biology community. To foster developments in the area and to establish the state of the art of present methods, a function prediction category was tentatively introduced in the 6th edition of the Critical Assessment of Techniques for Protein Structure Prediction (CASP6) worldwide experiment. Assessing the performance of the methods was made difficult by at least two factors: (a) the experimentally determined function of the targets was not available at the time of assessment; (b) the experiment was run blindly, so it was not possible to verify whether the convergence of different predictions towards the same functional annotation was due to the similarity of the methods or to a genuine signal detectable by different methodologies. In this work, we collected information about the methods used by the various predictors and revisited the results of the experiment by verifying how often, and in which cases, a convergent prediction was obtained by methods based on different rationales. We propose a method for classifying the type and redundancy of the methods. We also analyzed the cases in which a function for the target protein has since become available. Our results show that predictions derived from a consensus of different methods can reach an accuracy as high as 80%. It follows that some of the predictions submitted to CASP6, once reanalyzed taking into account the type of converging methods, can provide very useful information to researchers interested in the function of the target proteins.
