A homology/ab initio hybrid algorithm for sampling near‐native protein conformations

A major challenge in protein tertiary structure prediction is the quality of conformational sampling: an algorithm must search protein fold space effectively and readily to generate near‐native conformations. To advance the field by making the best use of available homology and fold recognition approaches together with ab initio folding methods, we have developed Bhageerath‐H Strgen, a homology/ab initio hybrid algorithm for protein conformational sampling. The methodology is tested on the benchmark CASP9 dataset of 116 targets. In 93% of the cases, the pool of decoys contains a structure with TM‐score ≥ 0.5. Further, Bhageerath‐H Strgen performed efficiently in comparison with other decoy generation methods. The algorithm is freely accessible as the Bhageerath‐H Strgen web tool for protein decoy generation (http://www.scfbio-iitd.res.in/software/Bhageerath-HStrgen1.jsp). © 2013 Wiley Periodicals, Inc.
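For readers unfamiliar with the success criterion quoted above, the following is a minimal sketch of how a TM‐score and the fraction of targets with at least one decoy at TM‐score ≥ 0.5 could be computed. It uses the standard TM‐score definition (a length‐normalized sum of 1 / (1 + (d_i / d0)^2) over aligned residue pairs, with d0 = 1.24 (L − 15)^(1/3) − 1.8). It is not the Bhageerath‐H Strgen code. The function names, the 0.5 Å floor on d0 for short chains, and the assumption that per‐residue distances come from an already optimal superposition are illustrative choices, not details taken from the paper.

```python
from typing import Sequence


def tm_d0(target_length: int) -> float:
    """Length-dependent distance scale d0 (in angstroms) of the TM-score."""
    # Assumption: short chains get a 0.5 A floor, since the cube-root
    # formula is not meaningful for target_length <= 15.
    if target_length <= 15:
        return 0.5
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score(distances: Sequence[float], target_length: int) -> float:
    """TM-score for one fixed superposition.

    distances: C-alpha distances (angstroms) of aligned residue pairs.
    target_length: number of residues in the native (target) structure.
    The full TM-score maximizes this value over superpositions; here the
    superposition is assumed to have been optimized already.
    """
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


def fraction_with_hit(best_scores: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of targets whose best decoy reaches the TM-score threshold."""
    if not best_scores:
        return 0.0
    hits = sum(1 for score in best_scores if score >= threshold)
    return hits / len(best_scores)
```

Here `best_scores` would hold, for each target, the highest TM‐score found among its decoys. The reported 93% corresponds to this fraction evaluated over the 116 CASP9 targets.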
