SCAN: SNP and copy number annotation

MOTIVATION: Genome-wide association studies (GWAS) generate associations between hundreds of thousands of single nucleotide polymorphisms (SNPs) and complex phenotypes. The contribution of the traditionally overlooked copy number variations (CNVs) to complex traits is also being actively studied. To facilitate the interpretation of these data and the design of follow-up experimental validation, we have developed a database that enables sensible prioritization of these variants by combining several approaches: publicly available physical and functional annotations, multilocus linkage disequilibrium (LD) annotations, and annotations of expression quantitative trait loci (eQTLs).

RESULTS: For each SNP, the SCAN database provides: (i) summary information from eQTL mapping of HapMap SNPs to gene expression (evaluated by the Affymetrix exon array) in the full set of HapMap CEU (Caucasians from UT, USA) and YRI (Yoruba people from Ibadan, Nigeria) samples; (ii) LD information, in the case of a HapMap SNP, including which genes have variation in strong LD (pairwise or multilocus LD) with the variant and how well the SNP is covered by different high-throughput platforms; (iii) summary information available from public databases (e.g. physical and functional annotations); and (iv) summary information from other GWAS. For each gene, SCAN provides: (i) eQTLs for the gene (both local and distant SNPs) and (ii) the coverage of all HapMap variants at that gene on each high-throughput platform. For each genomic region, SCAN provides: (i) physical and functional annotations of all SNPs, genes and known CNVs within the region and (ii) all genes regulated by the eQTLs within the region.

AVAILABILITY: http://www.scandb.org.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
