mCSM–NA: predicting the effects of mutations on protein–nucleic acids interactions

Abstract

Over the past two decades, several computational methods have been proposed to predict how missense mutations affect protein structure and function, either by altering protein stability or by altering interactions with binding partners, shedding light on the molecular mechanisms that may give rise to different phenotypes. Effectively and efficiently predicting the consequences of mutations on protein–nucleic acid interactions, however, remained until recently a major unmet challenge. Here we report an updated web server for mCSM–NA, the only scalable method we are aware of that quantitatively predicts the effects of mutations in protein-coding regions on nucleic acid binding affinities. We have significantly enhanced the original method by incorporating pharmacophore modelling and nucleic acid properties into our graph-based signatures, by considering the reverse mutation, and by using a refined, more reliable data set based on a new release of the ProNIT database. Together, these changes have significantly improved the reliability and applicability of the methodology. Our new predictive model achieved a correlation coefficient of up to 0.70 on cross-validation and 0.68 on blind tests, outperforming its previous version. The server is freely available via a user-friendly web interface at: http://structure.bioc.cam.ac.uk/mcsm_na.
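As an illustration of two ideas mentioned in the abstract, the sketch below shows how a data set of mutations might be augmented with hypothetical reverse mutations, and how a Pearson correlation coefficient between predicted and measured values could be computed. It is not the mCSM–NA implementation. It assumes that the reverse mutation carries the negated change in binding free energy, and the `Mutation` record, the function names, and the numeric values are invented for the example.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Mutation:
    """Hypothetical record: one point mutation and its measured effect."""
    wild_type: str   # one-letter code of the original residue
    position: int    # residue number in the protein chain
    mutant: str      # one-letter code of the substituted residue
    ddg: float       # change in binding free energy (kcal/mol), illustrative


def reverse(m: Mutation) -> Mutation:
    """Build the reverse mutation, assuming its effect is the negated ddg."""
    return Mutation(m.mutant, m.position, m.wild_type, -m.ddg)


def augment(mutations: list[Mutation]) -> list[Mutation]:
    """Return the original mutations followed by their reverse mutations."""
    return list(mutations) + [reverse(m) for m in mutations]


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


if __name__ == "__main__":
    # Illustrative values only; not taken from ProNIT or the paper.
    data = [Mutation("R", 273, "H", -1.2), Mutation("K", 120, "A", -0.4)]
    for m in augment(data):
        print(m)
```

Under this assumption the augmented set is balanced between positive and negative effects, which is one common motivation for including reverse mutations when training a predictor.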
