Systematic biological prioritization after a genome-wide association study: an application to nicotine dependence

MOTIVATION: A challenging problem after a genome-wide association study (GWAS) is balancing the statistical evidence of genotype-phenotype correlation against a priori evidence of biological relevance.

RESULTS: We introduce a method for systematically prioritizing single nucleotide polymorphisms (SNPs) for further study after a GWAS. The method combines evidence across multiple domains, including statistical evidence of genotype-phenotype correlation, known pathways in the pathologic development of disease, SNP/gene functional properties, comparative genomics, prior evidence of genetic linkage, and linkage disequilibrium. We apply this method to a GWAS of nicotine dependence, and use simulated data to test it on several commercial SNP microarrays.

AVAILABILITY: A comprehensive database of biological prioritization scores for all known SNPs is available at http://zork.wustl.edu/gin. These scores can be used to prioritize nicotine dependence association studies through a straightforward mathematical formula; no special software is necessary.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
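
The abstract does not state the formula itself, so the following is only an illustrative sketch of how a per-SNP biological prioritization score might be combined with a GWAS p-value. The additive log-scale rule, the function names, the SNP labels, and all numeric values are assumptions made for illustration; they are not taken from the paper or its database.

```python
import math


def prioritization_score(p_value, bio_weight):
    """Hypothetical combined score (an assumption, not the paper's formula).

    Adds the statistical evidence, -log10(p), to the log10 of a positive
    biological weight, so a weight above 1 promotes a SNP and a weight
    below 1 demotes it.
    """
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must be in (0, 1]")
    if bio_weight <= 0.0:
        raise ValueError("bio_weight must be positive")
    return -math.log10(p_value) + math.log10(bio_weight)


def rank_snps(snps):
    """Return SNP records sorted from highest to lowest combined score."""
    return sorted(
        snps,
        key=lambda s: prioritization_score(s["p"], s["weight"]),
        reverse=True,
    )


# Made-up records: "p" is the association p-value and "weight" is a
# hypothetical biological prioritization weight.
example_snps = [
    {"name": "snp_a", "p": 1e-4, "weight": 1.0},
    {"name": "snp_b", "p": 1e-3, "weight": 100.0},
    {"name": "snp_c", "p": 1e-5, "weight": 0.5},
]

ranked_names = [s["name"] for s in rank_snps(example_snps)]
print(ranked_names)
```

In this toy input, snp_b has the weakest p-value but ranks first because its assumed biological weight is large, which is the kind of trade-off between statistical and biological evidence that the method is meant to make explicit.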
