Testing computational prediction of missense mutation phenotypes: Functional characterization of 204 mutations of human cystathionine beta synthase

Predicting the phenotypes of missense mutations uncovered by large‐scale sequencing projects is an important goal in computational biology. High‐confidence predictions can help focus experimental and association studies on those mutations most likely to be associated with causative relationships between mutation and disease. As an aid in developing these methods further, we have derived a set of random mutations of the enzymatic domains of human cystathionine beta synthase (CBS). This enzyme is a dimeric protein that catalyzes the condensation of serine and homocysteine to produce cystathionine. Yeast missing this enzyme cannot grow on medium lacking a source of cysteine, while transfection of functional human CBS into yeast strains missing the endogenous enzyme can successfully complement the missing gene. We used PCR mutagenesis with error‐prone Taq polymerase to produce 948 colonies and compared cell growth in the presence or absence of a cysteine source as a measure of CBS function. We were able to infer the phenotypes of 204 single‐site mutants, 79 of them deleterious and 125 neutral. This set was used to test the accuracy of six publicly available methods for predicting the phenotypes of missense mutations: SIFT, PolyPhen, PMut, SNPs3D, PhD‐SNP, and nsSNPAnalyzer. The top methods are PolyPhen, SIFT, and nsSNPAnalyzer, which have similar performance. Using kernel discriminant functions, we found that the difference in position‐specific scoring matrix (PSSM) values is more predictive than the wild‐type PSSM score alone, and that the relative surface area in the biologically relevant complex is more predictive than that of the monomeric proteins. Proteins 2010. © 2010 Wiley‐Liss, Inc.
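Because the test set is unbalanced (79 deleterious versus 125 neutral mutants), raw accuracy alone can flatter a predictor that favors the majority class. The following is a minimal sketch, not the authors' evaluation code, of how a binary predictor could be scored on a set of this size using sensitivity, specificity, balanced accuracy, and the Matthews correlation coefficient. Only the two class sizes come from the abstract; the function name `evaluate` and any counts of correct predictions passed to it are hypothetical and do not reflect results for any of the six methods.

```python
from math import sqrt

# Class sizes taken from the abstract; everything else is illustrative.
N_DELETERIOUS = 79
N_NEUTRAL = 125


def evaluate(tp, tn, n_pos=N_DELETERIOUS, n_neg=N_NEUTRAL):
    """Score a binary predictor, treating 'deleterious' as the positive class.

    tp: deleterious mutants correctly called deleterious (hypothetical input)
    tn: neutral mutants correctly called neutral (hypothetical input)
    """
    fn = n_pos - tp  # deleterious mutants called neutral
    fp = n_neg - tn  # neutral mutants called deleterious

    sensitivity = tp / n_pos
    specificity = tn / n_neg
    accuracy = (tp + tn) / (n_pos + n_neg)
    balanced_accuracy = (sensitivity + specificity) / 2

    # Matthews correlation coefficient; defined here as 0.0 when any
    # marginal total is zero and the ratio would otherwise be undefined.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "mcc": mcc,
    }


# A trivial predictor that calls every mutant neutral still reaches about
# 61% accuracy, but its balanced accuracy is 0.5 and its MCC is 0.0.
always_neutral = evaluate(tp=0, tn=N_NEUTRAL)
```

Comparing `always_neutral["accuracy"]` with `always_neutral["mcc"]` shows why a class-balance-aware measure is the safer basis for ranking methods on a set like this one.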

[1]  J H Miller,et al.  Genetic studies of the lac repressor. XIII. Extensive amino acid replacements generated by the use of natural and synthetic nonsense suppressors. , 1990, Journal of molecular biology.

[2]  Kee-Hoon Kang,et al.  Bandwidth choice for nonparametric classification , 2005 .

[3]  F. Collins,et al.  New goals for the U.S. Human Genome Project: 1998-2003. , 1998, Science.

[4]  J. Moult,et al.  Identification and analysis of deleterious human SNPs. , 2006, Journal of molecular biology.

[5]  M. Orozco,et al.  Characterization of disease-associated single amino acid polymorphisms in terms of sequence and structure properties. , 2002, Journal of molecular biology.

[6]  Jeffrey Miller,et al.  Genetic Studies of Lac Repressor: 4000 Single Amino Acid Substitutions and Analysis of the Resulting Phenotypes on the Basis of the Protein Structure , 1996, German Conference on Bioinformatics.

[7]  J. Kraus,et al.  Trypsin cleavage of human cystathionine beta-synthase into an evolutionarily conserved active core: structural and functional consequences. , 1998, Archives of biochemistry and biophysics.

[8]  S. Henikoff,et al.  Predicting deleterious amino acid substitutions. , 2001, Genome research.

[9]  A. Pinchera,et al.  BRAFV599E Mutation Is the Leading Genetic Event in Adult Sporadic Papillary Thyroid Carcinomas , 2004 .

[10]  C Cruz,et al.  Genetic studies of the lac repressor. XIV. Analysis of 4000 altered Escherichia coli lac repressors reveals essential and non-essential residues, as well as "spacers" which do not require a specific sequence. , 1994, Journal of molecular biology.

[11]  A. Takashima,et al.  Twenty-nine missense mutations linked with familial Alzheimer's disease alter the processing of presenilin 1 , 1999, Progress in Neuro-Psychopharmacology and Biological Psychiatry.

[12]  B. Matthews Comparison of the predicted and observed secondary structure of T4 phage lysozyme. , 1975, Biochimica et biophysica acta.

[13]  Marianne Manchester,et al.  Complete mutagenesis of the HIV-1 protease , 1989, Nature.

[14]  S. Henikoff,et al.  Accounting for human polymorphisms predicted to affect protein function. , 2002, Genome research.

[15]  Zoran Obradovic,et al.  Statistical analysis of interface similarity in crystals of homologous proteins. , 2008, Journal of molecular biology.

[16]  J. Thornton,et al.  PQS: a protein quaternary structure file server. , 1998, Trends in biochemical sciences.

[17]  Modesto Orozco,et al.  PMUT: a web-based tool for the annotation of pathological mutations on proteins , 2005, Bioinform..

[18]  M. Lewis,et al.  A closer view of the conformation of the Lac repressor bound to operator , 2000, Nature Structural Biology.

[19]  G. Andria,et al.  Clinical aspects of cystathionine β-synthase deficiency: how wide is the spectrum? , 1998, European Journal of Pediatrics.

[20]  M. Macek,et al.  Mutation characterization of CFTR gene in 206 Northern Irish CF families: Thirty mutations, including two novel, account for ∼︁94% of CF chromosomes , 1996 .

[21]  Allen,et al.  Missense mutations in the insulin promoter factor-1 gene predispose to type 2 diabetes , 1999, The Journal of clinical investigation.

[22]  Patrice Koehl,et al.  The ASTRAL Compendium in 2004 , 2003, Nucleic Acids Res..

[23]  P. Burkhard,et al.  Structural insights into mutations of cystathionine beta-synthase. , 2003, Biochimica et biophysica acta.

[24]  A. Pinchera,et al.  BRAF(V599E) mutation is the leading genetic event in adult sporadic papillary thyroid carcinomas. , 2004, The Journal of clinical endocrinology and metabolism.

[25]  B. Matthews,et al.  Studies on protein stability with T4 lysozyme. , 1995, Advances in protein chemistry.

[26]  A. Bateman The structure of a domain common to archaebacteria and the homocystinuria disease protein. , 1997, Trends in biochemical sciences.

[27]  H. P. Wu,et al.  Screening for the Gly40Ser mutation in the glucagon receptor gene among patients with type 2 diabetes or essential hypertension in Taiwan. , 1999, Pancreas.

[28]  Emidio Capriotti,et al.  Bioinformatics Original Paper Predicting the Insurgence of Human Genetic Diseases Associated to Single Point Protein Mutations with Support Vector Machines and Evolutionary Information , 2022 .

[29]  P. Bork,et al.  Human non-synonymous SNPs: server and survey. , 2002, Nucleic acids research.

[30]  Elizabeth M. Smigielski,et al.  dbSNP: the NCBI database of genetic variation , 2001, Nucleic Acids Res..

[31]  Adrian A Canutescu,et al.  A graph‐theory algorithm for rapid protein side‐chain prediction , 2003, Protein science : a publication of the Protein Society.

[32]  K. Jhee,et al.  The role of cystathionine beta-synthase in homocysteine metabolism. , 2005, Antioxidants & redox signaling.

[33]  A Bairoch,et al.  SWISS-PROT: connecting biomolecular knowledge via a protein database. , 2001, Current issues in molecular biology.

[34]  R. Gordon,et al.  Cystathionine beta-synthase mutations in homocystinuria. , 1999, Human mutation.

[35]  Yan P. Yuan,et al.  HGVbase: a human sequence variation database emphasizing data quality and a broad spectrum of data sources , 2002, Nucleic Acids Res..

[36]  Leo Breiman,et al.  Random Forests , 2001, Machine Learning.

[37]  Dagmar Ringe,et al.  Human cystathionine beta-synthase is a heme sensor protein. Evidence that the redox sensor is heme and not the vicinal cysteines in the CXXC motif seen in the crystal structure of the truncated enzyme. , 2002, Biochemistry.

[38]  J. Stephens,et al.  SNP and haplotype variation in the human genome. , 2003, Mutation research.

[39]  D. Shortle,et al.  Contributions of the ionizable amino acids to the stability of staphylococcal nuclease. , 1996, Biochemistry.

[40]  Ceslovas Venclovas,et al.  Progress over the first decade of CASP experiments , 2005, Proteins.

[41]  Thomas L. Madden,et al.  Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. , 1997, Nucleic acids research.

[42]  D. Shortle,et al.  Evidence for strained interactions between side-chains and the polypeptide backbone. , 1994, Journal of molecular biology.

[43]  M. A. McClure,et al.  Hidden Markov models of biological primary sequence information. , 1994, Proceedings of the National Academy of Sciences of the United States of America.

[44]  D. Cox,et al.  A yeast assay for functional detection of mutations in the human cystathionine beta-synthase gene. , 1995, Human molecular genetics.

[45]  D. Cooper,et al.  Human Gene Mutation Database , 1996, Human Genetics.

[46]  J H Miller,et al.  Lac repressor genetic map in real space. , 1997, Trends in biochemical sciences.

[47]  Pierre Baldi,et al.  Assessing the accuracy of prediction algorithms for classification: an overview , 2000, Bioinform..

[48]  R. Gordon,et al.  Cystathionine β‐synthase mutations in homocystinuria , 1999 .

[49]  M. Vihinen,et al.  Crystal structure of a 1.6‐hexanediol bound tetrameric form of Escherichia coli Lac‐repressor refined to 2.1 Å resolution , 2008, Proteins.

[50]  David Haussler,et al.  Dirichlet mixtures: a method for improved detection of weak but significant protein sequence homology , 1996, Comput. Appl. Biosci..

[51]  J. Moult,et al.  Loss of protein structure stability as a major causative factor in monogenic disease. , 2005, Journal of molecular biology.

[52]  Hongyu Zhao,et al.  Haplotype analysis in population genetics and association studies. , 2003, Pharmacogenomics.

[53]  P. J. Green,et al.  Density Estimation for Statistics and Data Analysis , 1987 .

[54]  S. Bouvier,et al.  Systematic mutation of bacteriophage T4 lysozyme. , 1991, Journal of molecular biology.

[55]  S. Bell,et al.  Charting a course through RNA polymerase , 2000, Nature Structural Biology.

[56]  P. Bork,et al.  Towards a structural basis of human non-synonymous single nucleotide polymorphisms. , 2000, Trends in genetics : TIG.

[57]  G. Guanti,et al.  Clinical findings in a family with familial adenomatous polyposis and a missense mutation of the adenomatous polyposis coli gene. , 1996, Scandinavian journal of gastroenterology.

[58]  Mi Zhou,et al.  nsSNPAnalyzer: identifying disease-associated nonsynonymous single nucleotide polymorphisms , 2005, Nucleic Acids Res..

[59]  D. Shortle,et al.  Mutant forms of staphylococcal nuclease with altered patterns of guanidine hydrochloride and urea denaturation , 1986, Proteins.

[60]  Y. Wu,et al.  Missense alterations of BRCA1 gene detected in diverse cancer patients. , 2000, Anticancer research.

[61]  Zoran Obradovic,et al.  ProtBuD: a database of biological unit structures of protein families and superfamilies , 2006, Bioinform..

[62]  S. Mudd Disorders of transsulfuration , 1989 .

[63]  Yan Cui,et al.  Prediction of the phenotypic effects of non-synonymous single nucleotide polymorphisms using structural and evolutionary information , 2005, Bioinform..

[64]  P. Burkhard,et al.  Structural insights into mutations of cystathionine β-synthase , 2003 .

[65]  P. Ueland,et al.  Functional modeling of vitamin responsiveness in yeast: a common pyridoxine-responsive cystathionine beta-synthase mutation in homocystinuria. , 1997, Human molecular genetics.

[66]  J. Thornton,et al.  Influence of proline residues on protein conformation. , 1991, Journal of molecular biology.

[67]  S. Tapscott,et al.  Sibling rivalry, arrested development and chromosomal mayhem , 1998, Nature Genetics.

[68]  D. Shortle,et al.  Contributions of the polar, uncharged amino acids to the stability of staphylococcal nuclease: evidence for mutational effects on the free energy of the denatured state. , 1992, Biochemistry.

[69]  G. Chang,et al.  Crystal Structure of the Lactose Operon Repressor and Its Complexes with DNA and Inducer , 1996, Science.

[70]  B. Dubois,et al.  Early-onset autosomal dominant Alzheimer disease: prevalence, genetic heterogeneity, and mutation spectrum. , 1999, American journal of human genetics.

[71]  F. Collins,et al.  The HapMap and genome-wide association studies in diagnosis and therapy. , 2009, Annual review of medicine.

[72]  Warren C. Lathe,et al.  Prediction of deleterious human alleles. , 2001, Human molecular genetics.

[73]  P Burkhard,et al.  Structure of human cystathionine β‐synthase: a unique pyridoxal 5′‐phosphate‐dependent heme protein , 2001, The EMBO journal.

[74]  M. Orozco,et al.  Sequence‐based prediction of pathological mutations , 2004, Proteins.

[75]  G. Ehrlich,et al.  The Metabolic Basis Of Inherited Disease. , 1973 .

[76]  G. Andria,et al.  Clinical aspects of cystathionine beta-synthase deficiency: how wide is the spectrum? The Italian Collaborative Study Group on Homocystinuria. , 1998, European journal of pediatrics.

[77]  Geoffrey I. Webb,et al.  An Experimental Evaluation of Integrating Machine Learning with Knowledge Acquisition , 1999, Machine Learning.

[78]  T. Kunkel,et al.  Discrimination against purine–pyrimidine mispairs in the polymerase active site of DNA polymerase I: A structural explanation , 2002, Proceedings of the National Academy of Sciences of the United States of America.

[79]  Damian Smedley,et al.  Ensembl 2004 , 2004, Nucleic Acids Res..

[80]  J. Miller,et al.  Mutations affecting the quaternary structure of the lac repressor. , 1976, The Journal of biological chemistry.

[81]  Roland L. Dunbrack,et al.  proteins STRUCTURE O FUNCTION O BIOINFORMATICS Improved prediction of protein side-chain conformations with SCWRL4 , 2022 .

[82]  K. Jhee,et al.  The Role of Cystathionine β-Synthase in Homocysteine Metabolism , 2005 .

[83]  J. Thompson,et al.  CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. , 1994, Nucleic acids research.

[84]  P. Goodfellow,et al.  Single missense mutation in the tyrosine kinase catalytic domain of the RET protooncogene is associated with multiple endocrine neoplasia type 2B. , 1994, Proceedings of the National Academy of Sciences of the United States of America.

[85]  Stephen P. Miller,et al.  Characterization of glucokinase mutations associated with maturity-onset diabetes of the young type 2 (MODY-2): different glucokinase defects lead to a common phenotype. , 1999, Diabetes.

[86]  S. Henikoff,et al.  Amino acid substitution matrices from protein blocks. , 1992, Proceedings of the National Academy of Sciences of the United States of America.

[87]  A. Fersht,et al.  Active barnase variants with completely random hydrophobic cores. , 1996, Proceedings of the National Academy of Sciences of the United States of America.

[88]  P. Bork,et al.  Prediction of nonsynonymous single nucleotide polymorphisms in human disease-associated genes , 1999, Journal of Molecular Medicine.

[89]  P. Stenson,et al.  Human Gene Mutation Database (HGMD®): 2003 update , 2003, Human mutation.

[90]  L. Elsas,et al.  Cystathionine β‐synthase deficiency in Georgia (USA): Correlation of clinical and biochemical phenotype with genotype , 2003, Human mutation.

[91]  Sandor Vajda,et al.  CAPRI: A Critical Assessment of PRedicted Interactions , 2003, Proteins.

[92]  Steven Henikoff,et al.  SIFT: predicting amino acid changes that affect protein function , 2003, Nucleic Acids Res..

[93]  Roland L. Dunbrack,et al.  Mutations in the regulatory domain of cystathionine beta synthase can functionally suppress patient-derived mutations in cis. , 2001, Human molecular genetics.

[94]  P. Propping,et al.  Hereditary nonpolyposis colorectal cancer: causative role of a germline missense mutation in the hMLH1 gene confirmed by the independent occurrence of the same somatic mutation in tumour tissue , 1997, Human Genetics.

[95]  C. Broeckhoven,et al.  Missense mutation in exon 11 (codon 378) of the presenilin‐1 gene in a French family with early‐onset Alzheimer's disease and transmission study by mismatch enhanced allele specific amplification , 1998, Human mutation.

[96]  L. Loeb,et al.  Thermus aquaticus DNA Polymerase I Mutants with Altered Fidelity , 2000, The Journal of Biological Chemistry.