Contact replacement for NMR resonance assignment

Motivation: Complementing its traditional role in structural studies of proteins, nuclear magnetic resonance (NMR) spectroscopy is playing an increasingly important role in functional studies. NMR dynamics experiments characterize motions involved in target recognition, ligand binding, etc., while NMR chemical shift perturbation experiments identify and localize protein–protein and protein–ligand interactions. The key bottleneck in these studies is determining the backbone resonance assignment, which allows spectral peaks to be mapped to specific atoms. This article develops a novel approach to address that bottleneck, exploiting an available X-ray structure or homology model to assign the entire backbone from a set of relatively fast and cheap NMR experiments.

Results: We formulate contact replacement for resonance assignment as the problem of computing correspondences between a contact graph representing the structure and an NMR graph representing the data; the NMR graph is a significantly corrupted, ambiguous version of the contact graph. We first show that by combining connectivity and amino acid type information, and exploiting the random structure of the noise, one can provably determine unique correspondences in polynomial time with high probability, even in the presence of significant noise (a constant number of noisy edges per vertex). We then detail an efficient randomized algorithm and show that, over a variety of experimental and synthetic datasets, it is robust to typical levels of structural variation (1–2 Å), noise (250–600%) and missing data (10–40%). Our algorithm achieves very good overall assignment accuracy: above 80% in α-helices, 70% in β-sheets and 60% in loop regions.

Availability: Our contact replacement algorithm is implemented in platform-independent Python code. The software can be freely obtained for academic use by request from the authors.

Contact: gopal@cs.purdue.edu; cbk@cs.dartmouth.edu
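To make the graph formulation in the Results concrete, the following is a minimal sketch of how a contact graph and a corrupted NMR graph might be set up. It is not the authors' implementation. The function names (`contact_graph`, `corrupt`), the 5 Å cutoff, the use of one representative point per residue, and the reading of the noise percentage as false edges relative to the number of true contact edges are all assumptions made for illustration.

```python
import random
from itertools import combinations


def contact_graph(coords, cutoff=5.0):
    """Return undirected edges (i, j), i < j, between residues whose
    representative points lie within `cutoff` angstroms (assumed value)."""
    edges = set()
    for i, j in combinations(range(len(coords)), 2):
        d2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
        if d2 <= cutoff ** 2:
            edges.add((i, j))
    return edges


def corrupt(edges, n, noise_ratio, missing_frac, rng):
    """Simulate an NMR graph from a contact graph on n vertices (hypothetical model).

    - each true edge is dropped independently with probability `missing_frac`
    - `noise_ratio * len(edges)` random false edges are added
    - vertices are relabeled by a random permutation, hiding the
      correspondence that an assignment algorithm would have to recover
    Returns the relabeled edge set and the permutation (perm[i] is the
    NMR-graph label of structure vertex i).
    """
    kept = {e for e in edges if rng.random() >= missing_frac}
    non_edges = [e for e in combinations(range(n), 2) if e not in edges]
    n_noise = min(len(non_edges), round(noise_ratio * len(edges)))
    noisy = kept | set(rng.sample(non_edges, n_noise))
    perm = list(range(n))
    rng.shuffle(perm)
    relabeled = {tuple(sorted((perm[i], perm[j]))) for i, j in noisy}
    return relabeled, perm


if __name__ == "__main__":
    # Toy "structure": six points on a line, 3.8 angstroms apart.
    coords = [(3.8 * i, 0.0, 0.0) for i in range(6)]
    g = contact_graph(coords)
    nmr, perm = corrupt(g, len(coords), noise_ratio=2.5,
                        missing_frac=0.2, rng=random.Random(0))
    print(len(g), "contact edges;", len(nmr), "NMR-graph edges")
```

In this toy model, the assignment problem is to recover `perm` given only the structure-derived graph and the relabeled, noisy edge set; the abstract's approach additionally uses amino acid type information, which this sketch omits.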
