PhosSNP for Systematic Analysis of Genetic Polymorphisms That Influence Protein Phosphorylation

We are entering the era of personalized genomics, as breakthroughs in sequencing technology have made it possible to sequence or genotype an individual efficiently and accurately. Preliminary results from HapMap and similar projects have revealed tremendous genetic variation among world populations and among individuals. It is important to delineate the functional implications of such variation, i.e. whether it affects the stability and biochemical properties of proteins. Genetic variation is also generally believed to be the main cause of differing susceptibility to certain diseases and differing responses to therapeutic treatments. Understanding genetic variation in the context of human diseases thus holds promise for “personalized medicine.” In this work, we carried out a genome-wide analysis of single nucleotide polymorphisms (SNPs) that could potentially influence protein phosphorylation characteristics in humans. We defined a phosphorylation-related SNP (phosSNP) as a non-synonymous SNP (nsSNP) that affects protein phosphorylation status. Using an in-house kinase-specific phosphorylation site predictor (GPS 2.0), we computationally predicted that ∼70% of the reported nsSNPs are potential phosSNPs. More interestingly, ∼74.6% of these potential phosSNPs might also induce changes in the protein kinase types of adjacent phosphorylation sites rather than creating or removing phosphorylation sites directly. Taken together, we propose that a large proportion of nsSNPs might affect protein phosphorylation characteristics and play important roles in rewiring biological pathways. Finally, all phosSNPs were integrated into the PhosSNP 1.0 database, which was implemented in Java 1.5 (J2SE 5.0). The PhosSNP 1.0 database is freely available for academic researchers.
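The distinction drawn above, between an nsSNP that creates or removes a phosphorylation site directly and one that changes the kinases predicted for an adjacent site, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `toy_predictor` is a hypothetical stand-in with two made-up motif rules, not GPS 2.0 or its interface, and the function names and labels are not taken from the PhosSNP database.

```python
from typing import Callable, Dict, Set

# Maps a 0-based residue position to the kinases predicted to phosphorylate it.
Predictions = Dict[int, Set[str]]
Predictor = Callable[[str], Predictions]


def toy_predictor(seq: str) -> Predictions:
    """Hypothetical stand-in for a kinase-specific predictor (NOT GPS 2.0).

    Toy rules: S/T followed by P is called a "CDK" site;
    S/T with R three residues upstream is called a "PKA" site.
    """
    hits: Predictions = {}
    for i, aa in enumerate(seq):
        if aa not in "ST":
            continue
        kinases: Set[str] = set()
        if i + 1 < len(seq) and seq[i + 1] == "P":
            kinases.add("CDK")
        if i >= 3 and seq[i - 3] == "R":
            kinases.add("PKA")
        if kinases:
            hits[i] = kinases
    return hits


def classify_nssnp(wild: str, pos: int, alt: str,
                   predict: Predictor = toy_predictor) -> str:
    """Label one amino-acid substitution by its predicted effect."""
    variant = wild[:pos] + alt + wild[pos + 1:]
    before, after = predict(wild), predict(variant)
    if before == after:
        return "no predicted effect"
    if pos in after and pos not in before:
        return "creates site"
    if pos in before and pos not in after:
        return "removes site"
    # Predictions differ, but no site was gained or lost at the substituted residue.
    return "changes adjacent site"


print(classify_nssnp("AAAAPAA", 3, "S"))  # creates site
print(classify_nssnp("RAASPAA", 0, "A"))  # changes adjacent site
```

In the second call the substituted arginine is not itself phosphorylated, yet losing it removes the toy "PKA" prediction from the serine three residues downstream. This is the kind of indirect effect the abstract attributes to most potential phosSNPs.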
