In Silico Analysis of Single Nucleotide Polymorphisms (SNPs) in the Human β-Globin Gene

Single amino acid substitutions in the globin chains are the most common genetic variations that produce hemoglobinopathies, the most widespread inherited disorders worldwide. Several hemoglobinopathies result from homozygosity or compound heterozygosity for beta-globin (HBB) gene mutations, such as those producing sickle cell hemoglobin (HbS), HbC, HbD and HbE. Several of these mutations are deleterious and cause moderate to severe hemolytic anemia with associated complications, requiring lifelong care and management. Although many hemoglobinopathies result from single amino acid changes that produce similar structural abnormalities, the resulting variants differ functionally. Using in silico methods, we examined the genetic variations that can alter the expression and function of the HBB gene. The sequence homology-based Sorting Intolerant from Tolerant (SIFT) server predicted 200 (80%) of the non-synonymous polymorphisms to be deleterious. The structure-based PolyPhen server indicated that 135 (40%) of the non-synonymous polymorphisms may modify protein function and structure. The PupaSuite software predicted that the SNPs have phenotypic consequences for the structure and function of the altered protein. Structure analysis was performed on the key mutations in the native protein encoded by the HBB gene that cause hemoglobinopathies: HbC (E→K), HbD (E→Q), HbE (E→K) and HbS (E→V). Atomic Non-Local Environment Assessment (ANOLEA), Yet Another Scientific Artificial Reality Application (YASARA), the CHARMM-GUI web server for macromolecular dynamics and mechanics, and the Normal Mode Analysis, Deformation and Refinement (NOMAD-Ref) server using Gromacs were used to perform molecular dynamics simulations and energy minimization calculations on the β-chain encoded by the HBB gene before and after mutation. Furthermore, in the native and altered protein models, the solvent accessibility of amino acid residues was determined and secondary structures were examined to assess protein stability. The functional analysis in this investigation may serve as a model for future studies.
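As an illustration of the "before and after mutation" comparison, the following minimal Python sketch generates the mutant β-chain sequences for the four variants named above. It is not the authors' pipeline. Several details are assumptions not given in the abstract: the residue positions (β6 for HbS and HbC, β26 for HbE, β121 for HbD, taken here to be the HbD-Punjab variant), the mature β-globin sequence (transcribed to match UniProt P68871 without the initiator methionine, and worth checking against the database before any real use), and the helper names `apply_substitution` and `mutant_sequences`, which are hypothetical.

```python
# Mature human beta-globin chain, 146 residues (initiator Met removed).
# Assumption: transcribed to match UniProt P68871; verify before real use.
HBB_MATURE = (
    "VHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLST"
    "PDAVMGNPKVKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDP"
    "ENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH"
)

# Variant name -> (1-based position, reference residue, substituted residue).
# Positions are assumptions for illustration; the abstract gives only the
# residue changes (E->V, E->K, E->K, E->Q).
VARIANTS = {
    "HbS": (6, "E", "V"),
    "HbC": (6, "E", "K"),
    "HbE": (26, "E", "K"),
    "HbD": (121, "E", "Q"),
}


def apply_substitution(seq, position, ref, alt):
    """Return seq with the residue at 1-based `position` changed from ref to alt."""
    index = position - 1
    if not 0 <= index < len(seq):
        raise ValueError(f"position {position} is outside the sequence")
    if seq[index] != ref:
        raise ValueError(
            f"expected {ref} at position {position}, found {seq[index]}"
        )
    return seq[:index] + alt + seq[index + 1:]


def mutant_sequences(seq=HBB_MATURE, variants=VARIANTS):
    """Build one mutant sequence per named variant."""
    return {
        name: apply_substitution(seq, position, ref, alt)
        for name, (position, ref, alt) in variants.items()
    }


if __name__ == "__main__":
    for name, mutant in mutant_sequences().items():
        position = VARIANTS[name][0]
        print(name, position, HBB_MATURE[position - 1], "->", mutant[position - 1])
```

Checking the reference residue before substituting guards against off-by-one numbering errors, which are easy to make because β-globin variants are conventionally numbered on the mature chain rather than on the translated precursor. The resulting sequences could then be passed to whichever structure-modelling or prediction tool is being used.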
