BioShell-Threading: versatile Monte Carlo package for protein 3D threading

Background: The comparative modeling approach to protein structure prediction inherently relies on a template structure. Before a model can be built, such a template protein has to be found and aligned with the query sequence. Any error made at this stage may dramatically affect the quality of the result. There is a need, therefore, to develop accurate and sensitive alignment protocols.

Results: BioShell threading software is a versatile tool for aligning protein structures, protein sequences or sequence profiles, and query sequences to template structures. The software is also capable of generating sub-optimal alignments. It can be executed as an application from the UNIX command line, or as a set of Java classes called from a script or a Java application. The implemented Monte Carlo search engine greatly facilitates the development and benchmarking of new alignment scoring schemes, even when the functions exhibit non-deterministic polynomial-time complexity.

Conclusions: Numerical experiments indicate that the new threading application offers template detection abilities and provides much better alignments than other methods. The package, along with documentation and examples, is available at: http://bioshell.pl/threading3d.
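To illustrate why a Monte Carlo search suits scoring functions that include non-local (pairwise) terms, the following is a minimal sketch of a Metropolis search over gapped threading alignments. It is not the BioShell implementation or its API: the function names, the single-residue shift move, the toy contact set and the toy pair energy are all assumptions made for this example. An alignment is represented here as a strictly increasing list of template positions, one per query residue, and lower scores are treated as better.

```python
import math
import random


def metropolis_accept(delta, temperature, rng):
    """Metropolis criterion: accept improvements, else accept with exp(-delta/T)."""
    if delta <= 0:
        return True
    return rng.random() < math.exp(-delta / temperature)


def pair_score(alignment, query, contacts, pair_energy):
    """Sum toy pair energies over query residues placed on contacting template positions."""
    total = 0.0
    for i in range(len(alignment)):
        for j in range(i + 1, len(alignment)):
            if (alignment[i], alignment[j]) in contacts:
                total += pair_energy(query[i], query[j])
    return total


def propose(alignment, template_len, rng):
    """Shift one query residue by one template position; return None if order would break."""
    i = rng.randrange(len(alignment))
    new_pos = alignment[i] + rng.choice((-1, 1))
    lo = alignment[i - 1] + 1 if i > 0 else 0
    hi = alignment[i + 1] - 1 if i < len(alignment) - 1 else template_len - 1
    if not lo <= new_pos <= hi:
        return None
    moved = list(alignment)
    moved[i] = new_pos
    return moved


def monte_carlo_threading(query, template_len, contacts, pair_energy,
                          steps=2000, temperature=1.0, seed=0):
    """Hypothetical driver: returns the best alignment seen and its score."""
    rng = random.Random(seed)
    current = list(range(len(query)))  # start from the ungapped alignment
    current_score = pair_score(current, query, contacts, pair_energy)
    best, best_score = current, current_score
    for _ in range(steps):
        candidate = propose(current, template_len, rng)
        if candidate is None:
            continue  # move rejected because it violates ordering or bounds
        cand_score = pair_score(candidate, query, contacts, pair_energy)
        if metropolis_accept(cand_score - current_score, temperature, rng):
            current, current_score = candidate, cand_score
            if current_score < best_score:
                best, best_score = current, current_score
    return best, best_score


# Toy data (assumed): template contacts as (a, b) pairs with a < b.
toy_query = "HHPPHH"
toy_contacts = {(0, 8), (1, 7), (2, 5)}
toy_energy = lambda a, b: -1.0 if a == b == "H" else 0.0
toy_best, toy_best_score = monte_carlo_threading(toy_query, 9, toy_contacts, toy_energy)
```

Because the search only needs to evaluate the score of a proposed alignment, the scoring function can be swapped freely, which is the property the abstract highlights. Recording the alignments visited during the run, rather than only the best one, would be one simple way to collect sub-optimal alignments in a sketch like this.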
