Detecting genetic variation in microarray expression data.

The use of high-density oligonucleotide arrays to measure the expression levels of thousands of genes in parallel has become commonplace. To take further advantage of the growing body of data, we developed a method, termed "GeSNP," to mine the detailed hybridization patterns in oligonucleotide array expression data for evidence of genetic variation. To demonstrate the performance of the algorithm, the hybridization patterns in data obtained previously from SAMP8/Ta, SAMP10/Ta, and SAMR1/Ta inbred mice and from humans and chimpanzees were analyzed. Genes with consistent strain-specific and species-specific hybridization pattern differences were identified, and approximately 90% of the candidate genes were independently confirmed to harbor sequence differences. Importantly, the quality of gene expression data was also improved by masking the probes that target regions with putative sequence differences between species or strains. To illustrate the application to human disease groups, data from an inflammatory bowel disease study were analyzed. GeSNP identified sequence differences in candidate genes previously discovered in independent association and linkage studies and uncovered many promising new candidates. This approach enables the opportunistic extraction of genetic variation information from new or pre-existing gene expression data obtained with high-density oligonucleotide arrays.
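The general idea of comparing probe-level hybridization patterns between groups, and then masking the affected probes, can be illustrated with a minimal sketch. This is not the GeSNP algorithm itself, whose details are not given here. It assumes one probe set of eight probes, scales each sample by its median probe intensity so that only the shape of the pattern is compared, and flags a probe when the two groups do not overlap and differ by at least a chosen fold. The function names, the 1.5-fold cutoff, and the intensity values are all hypothetical.

```python
from statistics import mean, median

def probe_profile(intensities):
    """Scale one sample's probe intensities by their median so that the
    pattern shape is comparable across samples with different expression."""
    m = median(intensities)
    return [x / m for x in intensities]

def flag_probes(group_a, group_b, fold=1.5):
    """Return indices of probes whose scaled intensity differs consistently
    between groups: every sample in one group exceeds every sample in the
    other by at least `fold` (hypothetical criterion, not GeSNP's)."""
    prof_a = [probe_profile(s) for s in group_a]
    prof_b = [probe_profile(s) for s in group_b]
    flagged = []
    for j in range(len(prof_a[0])):
        a = [p[j] for p in prof_a]
        b = [p[j] for p in prof_b]
        if min(a) / max(b) >= fold or min(b) / max(a) >= fold:
            flagged.append(j)
    return flagged

def expression(intensities, masked=()):
    """Summarize a probe set as the mean of its unmasked probes."""
    return mean(x for j, x in enumerate(intensities) if j not in masked)

# Made-up intensities: 3 samples per strain, 8 probes in one probe set.
# In strain B, probe index 3 hybridizes poorly, as a sequence difference
# under that probe might cause.
STRAIN_A = [
    [100, 200, 150, 300, 250, 120, 180, 220],
    [210, 390, 310, 620, 480, 250, 350, 450],
    [95, 210, 140, 290, 260, 115, 190, 210],
]
STRAIN_B = [
    [105, 195, 155, 100, 245, 125, 175, 225],
    [200, 410, 290, 190, 510, 230, 370, 430],
    [98, 205, 148, 90, 255, 118, 185, 215],
]

flagged = flag_probes(STRAIN_A, STRAIN_B)

# Expression estimates for one sample of each strain, before and after masking.
unmasked_gap = abs(expression(STRAIN_A[0]) - expression(STRAIN_B[0]))
masked_gap = abs(expression(STRAIN_A[0], flagged) - expression(STRAIN_B[0], flagged))
print(flagged, unmasked_gap, masked_gap)
```

In this toy data the apparent expression difference between the two strains comes almost entirely from the single poorly hybridizing probe, so masking it brings the two estimates close together. A real analysis would need a statistical test and multiple-testing control rather than a fixed fold cutoff.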
