Cis-regulatory variations: A study of SNPs around genes showing cis-linkage in segregating mouse populations

Background

Changes in gene expression are known to be responsible for phenotypic variation and susceptibility to disease. Identifying and annotating the genomic sequence variants that cause gene expression changes is therefore likely to lead to a better understanding of the causes of disease at the molecular level. In this study we investigate the pattern of single nucleotide polymorphisms (SNPs) in genes whose mRNA levels show cis-genetic linkage (gene expression quantitative trait loci mapping in cis, or cis-eQTLs) in segregating mouse populations. Such genes are expected to have polymorphisms near their physical location (cis-variations) that affect their mRNA levels by altering one or more cis-regulatory elements. This led us to characterize the SNPs in promoter regions (5 kb upstream) and non-coding gene regions (introns and 5 kb downstream), referred to here as cis-SNPs, and the effects they may have on putative transcription factor binding sites.

Results

We demonstrate that cis-eQTL genes (CEGs) have a significantly higher frequency of cis-SNPs than non-CEGs when both sets are taken from non-IBD regions, i.e. regions that are not identical by descent. Most CEGs that have cis-SNPs do not carry them in phylogenetically conserved regions. In those CEGs that do contain cis-SNPs in phylogenetically conserved regions, enrichment of cis-SNPs occurs both within and outside the conserved sequences. A higher fraction of CEGs also harbor cis-SNPs that affect predicted transcription factor binding sites, a likely consequence of the higher cis-SNP density in these genes.

Conclusion

The present study provides the first genome-wide investigation of putative cis-regulatory variations in a large set of genes whose expression levels give rise to cis-linkage in segregating mammalian populations. Our results provide insights into the challenges of identifying polymorphisms that regulate gene expression using bioinformatic sequence analysis approaches. The data provided herein should benefit future investigations in this area.
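
As an illustration of the cis-SNP definition used in the abstract (5 kb upstream, introns, and 5 kb downstream), the following minimal Python sketch classifies SNP positions relative to a gene and computes a per-gene cis-SNP density. It is not the authors' pipeline: the `Gene` record, the function names, the inclusive 1-based coordinates, and the per-kb normalization are all assumptions made for this example.

```python
from dataclasses import dataclass

FLANK = 5_000  # 5 kb flanks, as stated in the abstract


@dataclass
class Gene:
    """Hypothetical gene record; coordinates are 1-based and inclusive."""
    start: int    # leftmost transcribed base
    end: int      # rightmost transcribed base
    strand: str   # "+" or "-"
    exons: list   # list of (start, end) pairs lying inside [start, end]


def cis_region_class(gene, pos):
    """Return 'promoter', 'intron', 'downstream', or None for a SNP position."""
    if gene.start <= pos <= gene.end:
        in_exon = any(s <= pos <= e for s, e in gene.exons)
        return None if in_exon else "intron"
    if gene.start - FLANK <= pos < gene.start:
        # Left flank is upstream on the plus strand, downstream on the minus strand.
        return "promoter" if gene.strand == "+" else "downstream"
    if gene.end < pos <= gene.end + FLANK:
        return "downstream" if gene.strand == "+" else "promoter"
    return None


def cis_snp_density(gene, snp_positions):
    """cis-SNPs per kb of promoter + intron + downstream sequence (assumed metric)."""
    n = sum(cis_region_class(gene, p) is not None for p in snp_positions)
    exon_len = sum(e - s + 1 for s, e in gene.exons)
    cis_len = (gene.end - gene.start + 1) - exon_len + 2 * FLANK
    return 1000 * n / cis_len
```

Under these assumptions, densities computed per gene could then be compared between CEGs and non-CEGs drawn from non-IBD regions; the choice of statistical test is left open here because the abstract does not specify one.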
