PconsC4: fast, accurate and hassle-free contact predictions

Motivation: Residue contact prediction was revolutionized recently by the introduction of direct coupling analysis (DCA). Further improvements, in particular for small families, have been obtained by combining DCA with deep learning methods. However, existing deep learning contact prediction methods often rely on a number of external programs and are therefore computationally expensive.

Results: Here, we introduce a novel contact predictor, PconsC4, which performs on par with state-of-the-art methods. PconsC4 is heavily optimized and does not use any external programs, and is therefore significantly faster and easier to use than other methods.

Availability: PconsC4 is freely available under the GPL license from https://github.com/ElofssonLab/PconsC4. It can be installed with the pip command and works on any system with Python 3.5 or later and a GCC compiler. It does not require a GPU or other special hardware.

Supplementary information: All data used in the development are available at Bioinformatics online.
