CONFOLD2: improved contact-driven ab initio protein structure modeling

Background Contact-guided protein structure prediction methods are becoming more and more successful because of the latest advances in residue-residue contact prediction. To support the contact-driven structure prediction, effective tools that can quickly build tertiary structural models of good quality from predicted contacts need to be developed. Results We develop an improved contact-driven protein modeling method, CONFOLD2, and study how it may be effectively used for ab initio protein structure prediction with predicted contacts as input. It builds models using various subsets of input contacts to explore the fold space under the guidance of a soft square energy function, and then clusters the models to obtain top five models. CONFOLD2 is benchmarked on various datasets including CASP11 and 12 datasets with publicly available predicted contacts and yields better performance than the popular CONFOLD method. Conclusion CONFOLD2 allows to quickly generate top five structural models for a protein sequence, when its secondary structures and contacts predictions at hand. CONFOLD2 is publicly available at https://github.com/multicom-toolbox/CONFOLD2/.

[1]  Jinbo Xu,et al.  Analysis of deep learning methods for blind protein contact prediction in CASP12 , 2018, Proteins.

[2]  A. Tramontano,et al.  New encouraging developments in contact prediction: Assessment of the CASP11 results , 2016, Proteins.

[3]  David T. Jones,et al.  De Novo Structure Prediction of Globular Proteins Aided by Sequence Variation-Derived Contacts , 2014, PloS one.

[4]  David T. Jones,et al.  Accurate contact predictions using covariation techniques and machine learning , 2015, Proteins.

[5]  A. Gronenborn,et al.  Determination of three-dimensional structures of proteins by simulated annealing with interproton distance restraints. Application to crambin, potato carboxypeptidase inhibitor and barley serine proteinase inhibitor 2. , 1988, Protein engineering.

[6]  Pierre Baldi,et al.  SCRATCH: a protein structure and structural feature prediction server , 2005, Nucleic Acids Res..

[7]  Jie Hou,et al.  ConEVA: a toolbox for comprehensive assessment of protein contacts , 2016, BMC Bioinformatics.

[8]  Oliver Brock,et al.  Analysis of free modeling predictions by RBO aleph in CASP11 , 2016, Proteins.

[9]  Arne Elofsson,et al.  MaxSub: an automated measure for the assessment of protein structure prediction quality , 2000, Bioinform..

[10]  Marcin J. Skwark,et al.  PconsFold: improved contact predictions improve protein models , 2014, Bioinform..

[11]  Mirco Michel,et al.  Large-scale structure prediction by improved contact predictions and model quality assessment , 2017, bioRxiv.

[12]  Jianlin Cheng,et al.  CONFOLD: Residue‐residue contact‐guided ab initio protein folding , 2015, Proteins.

[13]  Massimiliano Pontil,et al.  PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments , 2012, Bioinform..

[14]  David T. Jones,et al.  MetaPSICOV: combining coevolution methods for accurate prediction of contacts and long range hydrogen bonding in proteins , 2014, Bioinform..

[15]  Zhen Li,et al.  Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model , 2016, bioRxiv.