Automatic modeling of protein backbones in electron-density maps via prediction of Cα coordinates.

Most crystallographers today solve protein structures by first building as much of the protein backbone as possible and then modeling the side chains. Automating the determination of backbone coordinates by computer-based interpretation of the electron density would enhance the speed and possibly improve the accuracy of the structure-solution process. In this paper, a new computational procedure called CAPRA is described that predicts coordinates of Cα atoms in density maps and outputs chains of Cα atoms representing the backbone of the protein. The result constitutes a significant step beyond tracing the density, because there is ideally a one-to-one correspondence between atoms predicted in the chains output by CAPRA and Cα atoms in the true structure (refined model). CAPRA is based on pattern-recognition techniques, including extraction of rotation-invariant numeric features to represent patterns in the density and use of a neural network to predict which pseudo-atoms in the trace are closest to true Cα atoms. Experiments with several MAD and MIR electron-density maps of 2.4-2.8 Å resolution reveal that CAPRA is capable of building approximately 90% of the backbone of a protein molecule, with an r.m.s. error for Cα coordinates of around 0.9 Å.
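To illustrate what a rotation-invariant numeric feature of the density can look like, the following minimal sketch averages density values in concentric spherical shells around a candidate point. Because the features depend only on the distance of each sample from the center, they do not change when the local density is rotated. This is an illustrative assumption, not the feature set used by CAPRA, which the abstract does not specify; the names `shell_features`, `rotate_z_90`, `shell_width`, and `n_shells`, and the sample values, are hypothetical.

```python
import math


def shell_features(samples, center, shell_width=1.0, n_shells=4):
    """Mean density in concentric spherical shells around `center`.

    `samples` is an iterable of (x, y, z, density) tuples. Shell k holds
    samples whose distance r from the center satisfies
    k * shell_width <= r < (k + 1) * shell_width.
    Hypothetical illustration only; not CAPRA's actual features.
    """
    sums = [0.0] * n_shells
    counts = [0] * n_shells
    cx, cy, cz = center
    for x, y, z, rho in samples:
        dx, dy, dz = x - cx, y - cy, z - cz
        r = math.sqrt(dx * dx + dy * dy + dz * dz)
        shell = int(r // shell_width)
        if shell < n_shells:
            sums[shell] += rho
            counts[shell] += 1
    # Empty shells are reported as 0.0 rather than raising an error.
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def rotate_z_90(samples):
    """Rotate sample positions by 90 degrees about the z axis through the origin."""
    return [(-y, x, z, rho) for x, y, z, rho in samples]


# Made-up density samples around the origin (coordinates in Å, arbitrary density units).
samples = [
    (0.5, 0.0, 0.0, 2.0),
    (0.0, 0.5, 0.0, 1.0),
    (1.5, 0.0, 0.0, 0.8),
    (0.0, 1.2, 0.9, 0.4),
    (2.5, 0.5, 0.0, 0.1),
]

origin = (0.0, 0.0, 0.0)
features = shell_features(samples, origin)
rotated_features = shell_features(rotate_z_90(samples), origin)
print(features == rotated_features)
```

A feature vector of this kind could serve as input to a classifier such as the neural network mentioned above, since the same local density pattern yields the same vector regardless of how the molecule is oriented in the map.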
