GWASdb: a database for human genetic variants identified by genome-wide association studies

Recent advances in genome-wide association studies (GWAS) have identified thousands of genetic variants (GVs) associated with human diseases. As next-generation sequencing technologies become less expensive, more GVs will be discovered in the near future. Existing databases, such as the NHGRI GWAS Catalog, collect only GVs that reach genome-wide significance. However, many true disease susceptibility loci have only moderate P values and are therefore not included in these databases. We have developed GWASdb, which contains 20 times more data than the GWAS Catalog and includes less significant GVs (P < 1.0 × 10⁻³) manually curated from the literature. In addition, GWASdb provides comprehensive functional annotations for each GV, including genomic mapping information, regulatory effects (transcription factor binding sites, microRNA target sites and splice sites), amino acid substitutions, evolution, gene expression and disease associations. Furthermore, GWASdb classifies these GVs by disease using Disease-Ontology Lite and the Human Phenotype Ontology. It also supports pathway enrichment and protein-protein interaction (PPI) network association analysis for these diseases. GWASdb is an intuitive, multifunctional database that lets biologists and clinicians explore GVs and their functional inferences. It is freely available at http://jjwanglab.org/gwasdb and will be updated frequently.
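The difference between the two inclusion criteria can be illustrated with a minimal Python sketch. It is not GWASdb's implementation: the `Variant` record, the `filter_by_p` helper and the example rows are hypothetical, and the genome-wide cutoff of 5 × 10⁻⁸ is the conventional value, assumed here because the abstract does not state one. Only the 1.0 × 10⁻³ threshold comes from the text above.

```python
from dataclasses import dataclass

# Conventional genome-wide significance cutoff (an assumption; not stated in the abstract).
GENOME_WIDE_P = 5e-8
# Relaxed inclusion threshold quoted in the abstract.
GWASDB_P = 1.0e-3


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal record for one curated association."""
    rsid: str
    trait: str
    p_value: float


def filter_by_p(variants, threshold):
    """Return the variants whose P value is strictly below the threshold."""
    return [v for v in variants if v.p_value < threshold]


# Made-up rows for illustration only; these are not real associations.
variants = [
    Variant("rs_example_1", "trait A", 3e-9),
    Variant("rs_example_2", "trait A", 4e-5),
    Variant("rs_example_3", "trait B", 2e-2),
]

strict = filter_by_p(variants, GENOME_WIDE_P)
relaxed = filter_by_p(variants, GWASDB_P)

print([v.rsid for v in strict])
print([v.rsid for v in relaxed])
```

In this toy set, the moderately significant second variant is kept only under the relaxed threshold, which is the class of loci the abstract says stricter catalogs leave out.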
