The Status of Structural Genomics Defined Through the Analysis of Current Targets and Structures

Structural genomics, the large-scale determination of macromolecular 3-dimensional structures, is unique in that its major participants report scientific progress on a weekly basis. The target database (TargetDB) maintained by the Protein Data Bank (http://targetdb.pdb.org) reports this progress through the status of each protein sequence (target) under consideration by the major structural genomics centers worldwide. TargetDB therefore provides a unique opportunity to analyze the potential impact of this major initiative on scientists interested in the sequence-structure-function-disease paradigm. Here we report such an analysis with a focus on: (i) temporal characteristics: how is the project doing and what can we expect in the future? (ii) target characteristics: what are the predicted functions of the proteins targeted by structural genomics, and how biased is the target set when compared to the PDB and to predictions across complete genomes? (iii) structures solved: what are the characteristics of the structures solved thus far and what do they contribute? The analysis required a more extensive database of structure predictions, made with different methods and integrated with data from other sources. This database, associated tools and related data sources are available from http://spam.sdsc.edu.
