Investigating the Molecular Mechanisms Behind Uncharacterized Cysteine Losses from Prediction of Their Oxidation State

Cysteines are among the rarest amino acids in nature, yet they are both functionally and structurally very important for proteins. Their ability to form disulfide bonds is especially relevant, both for constraining the folded state of a protein and for performing enzymatic functions. But how does the variation record of human proteins reflect the functional importance and structural role of cysteines, especially with regard to deleterious mutations? To investigate how mutated cysteines relate to structural aspects such as surface accessibility and cysteine oxidation state, we created HUMCYS, a manually curated dataset of single amino acid variants that (1) have a known disease or neutral phenotypic outcome and (2) cause the loss of a cysteine. We also developed a sequence-based in silico cysteine oxidation predictor to overcome the scarcity of experimentally derived oxidation annotations, and applied it to extend our analysis to classes of proteins whose structures are technically challenging to determine experimentally, such as transmembrane proteins. Our investigation shows that this approach yields insights into the reasons behind the outcome of cysteine losses in otherwise uncharacterized proteins, and we discuss the possible molecular mechanisms leading to deleterious phenotypes, such as the involvement of the mutated cysteine in a structurally or enzymatically relevant disulfide bond.
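The two inclusion criteria stated for HUMCYS can be illustrated with a minimal sketch. This is not the authors' curation pipeline, which was manual; the `Variant` record, its field names, the phenotype labels, and the example identifiers are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical phenotype labels; the real dataset's vocabulary may differ.
KNOWN_PHENOTYPES = {"disease", "neutral"}


@dataclass(frozen=True)
class Variant:
    """A single amino acid variant (hypothetical record layout)."""
    protein_id: str
    position: int             # 1-based position in the protein sequence
    wild_type: str            # one-letter code of the original residue
    mutant: str               # one-letter code of the substituted residue
    phenotype: Optional[str]  # "disease", "neutral", or None if unknown


def is_annotated_cysteine_loss(variant: Variant) -> bool:
    """Return True if the variant meets both inclusion criteria."""
    # Criterion (1): a known disease or neutral phenotypic outcome.
    has_known_outcome = variant.phenotype in KNOWN_PHENOTYPES
    # Criterion (2): the substitution removes a cysteine.
    loses_cysteine = variant.wild_type == "C" and variant.mutant != "C"
    return has_known_outcome and loses_cysteine


def select_cysteine_losses(variants: List[Variant]) -> List[Variant]:
    """Keep only the variants that meet both criteria, preserving order."""
    return [v for v in variants if is_annotated_cysteine_loss(v)]


# Made-up example records, not taken from HUMCYS.
example_variants = [
    Variant("PROT_A", 42, "C", "Y", "disease"),   # kept
    Variant("PROT_A", 77, "C", "S", None),        # dropped: unknown outcome
    Variant("PROT_B", 10, "R", "C", "disease"),   # dropped: gains a cysteine
    Variant("PROT_C", 5, "C", "R", "neutral"),    # kept
]
selected = select_cysteine_losses(example_variants)
```

Variants that gain a cysteine, or that lack a phenotype annotation, fall outside the two criteria and are excluded in this sketch.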

[57]  Tom Lenaerts,et al.  Multilevel biological characterization of exomic variants at the protein level significantly improves the identification of their deleterious effects , 2016, Bioinform..