Iterative Optimal TM_Score and Z_Score Guided Sampling Significantly Improves Model Topology

Protein structure prediction aims to close the gap between the number of known protein sequences and the number of experimentally solved structures. Current template-based modelling (TBM) methodologies rely on selecting structural folds or evolutionarily related templates from already solved experimental structures. Even when the first model built from correctly selected template(s) by any of the employed modelling algorithms is correct, additional model sampling tends to degrade the initial model's topology, because increased sampling is not biased towards the correct near-native state of the target sequence. The model assessment measures employed during sampling pose a further problem: they do not reliably select the most accurate decoy among those generated for a protein sequence. We study these persistent sampling issues carefully and streamline the procedure so that it consistently yields highly accurate models for the majority of protein sequences. We design a TM-score and Z-score guided sampling algorithm that addresses the problem efficiently and brings the predictions closer to the native conformation of the target sequence. For 21 CASP8 TBM-HA targets (35 domains), our sampling methodology yields average improvements of 4.802 in GDT-HA and 0.031 in TM_Score over the best predicted CASP8 models. Our models are thus found to be accurate not only for the individual domain(s) but also for the overall conformation of the complete target sequence.
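As a rough illustration of score-guided sampling, the sketch below computes a TM-score from per-residue distances and runs a generic greedy keep-the-best loop. It is not the authors' algorithm, which the abstract does not specify. The names `tm_d0`, `tm_score`, and `guided_sampling` are hypothetical, the distances are assumed to come from an already fixed superposition (the full TM-score maximizes over superpositions), and the 0.5 Å floor on d0 is an assumption for very short chains.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 used by the TM-score."""
    if target_length <= 15:
        return 0.5  # assumed floor; the cube-root formula needs L > 15
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score(distances, target_length):
    """TM-score for one fixed superposition.

    distances: C-alpha distances (in angstroms) of the aligned residue pairs.
    target_length: number of residues in the target (native) structure.
    """
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


def guided_sampling(seed, propose, score, rounds):
    """Greedy loop: resample around the current best model and accept a
    candidate only if it scores strictly higher (hypothetical scheme)."""
    best, best_score = seed, score(seed)
    for _ in range(rounds):
        candidates = propose(best)
        if not candidates:
            break
        top = max(candidates, key=score)
        top_score = score(top)
        if top_score <= best_score:
            break  # no improvement, so keep the earlier topology
        best, best_score = top, top_score
    return best, best_score
```

Here `propose` stands in for a model-building step that returns new decoys from the current best model, and `score` for whatever assessment measure guides the search. Stopping at the first non-improving round is one simple way to keep extra sampling from overwriting a model that is already good.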
