Distant conserved sequences flanking endothelial-specific promoters contain tissue-specific DNase-hypersensitive sites and over-represented motifs.

The transcriptional regulation of genes is a complex process, particularly for genes with a tissue-specific pattern of expression. We studied 28 genes that are expressed primarily in endothelial cells, another 28 genes that are expressed highly, but not exclusively, in cultured endothelial cells, and three control sets: genes not expressed in endothelium, genes expressed in neural tissues, and housekeeping genes. For each gene, we identified conserved non-coding sequences (CNSs), 50 to more than 1000 nucleotides long, located within the upstream intergenic region (from 500 to as far as 200 000 nucleotides upstream of the transcription start site) or within the first intron. As a functional test, we assayed the CNSs from the set of endothelial cell-specific genes (EC-CNSs) for DNase hypersensitivity. Among 262 distant EC-CNSs, 33% are hypersensitive (HS) in endothelial cells, whereas only 16% are HS in control fibroblasts. A search for short sequence patterns revealed several motifs that are over-represented in EC-CNSs relative to CNSs from the control gene sets. In particular, the motif SAGGAAR is strongly and consistently over-represented among EC-CNSs, and is more over-represented in HS CNSs than in non-HS CNSs. CNSs that contain this motif are no closer to the promoter than the average CNS. This motif contains the core element of binding sites for the Ets family of transcription factors. Thus, one or several factors from this family may play a key role in the regulation of endothelial gene expression.
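To make the motif notation concrete, the following minimal Python sketch shows one way to count matches to SAGGAAR on both strands and to compare motif density between two sequence sets. In the standard IUPAC nucleotide code, S stands for C or G and R stands for A or G. The function names (`count_sites`, `sites_per_kb`, `enrichment`) and the density-ratio statistic are illustrative assumptions, not the procedure used in the study, and the sequences in the usage example are made up.

```python
import re

# Standard IUPAC nucleotide codes, expressed as regex character classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "S": "[CG]", "R": "[AG]", "W": "[AT]", "Y": "[CT]",
         "K": "[GT]", "M": "[AC]", "N": "[ACGT]"}

# Complement table that also handles ambiguity codes (e.g. R <-> Y).
COMPLEMENT = str.maketrans("ACGTSRWYKMN", "TGCASYWRMKN")


def reverse_complement(motif):
    """Return the reverse complement of an IUPAC motif."""
    return motif.translate(COMPLEMENT)[::-1]


def motif_regex(motif):
    """Compile a motif into a regex; the lookahead counts overlapping hits."""
    return re.compile("(?=(" + "".join(IUPAC[c] for c in motif) + "))")


def count_sites(seq, motif):
    """Count motif matches on both strands of a DNA sequence."""
    seq = seq.upper()
    forward = len(motif_regex(motif).findall(seq))
    rc = reverse_complement(motif)
    if rc == motif:  # palindromic motif: avoid counting each site twice
        return forward
    return forward + len(motif_regex(rc).findall(seq))


def sites_per_kb(seqs, motif):
    """Motif matches per 1000 nucleotides across a set of sequences."""
    total_len = sum(len(s) for s in seqs)
    total_sites = sum(count_sites(s, motif) for s in seqs)
    return 1000.0 * total_sites / total_len


def enrichment(foreground, background, motif):
    """Ratio of motif densities (hypothetical statistic, for illustration).

    Raises ZeroDivisionError if the background contains no match.
    """
    return sites_per_kb(foreground, motif) / sites_per_kb(background, motif)


if __name__ == "__main__":
    # Made-up toy sequences, not data from the study.
    fg = ["CAGGAAGTTT"]
    bg = ["CAGGAAGTTTACGTACGTAC"]
    print(reverse_complement("SAGGAAR"))
    print(enrichment(fg, bg, "SAGGAAR"))
```

A real analysis would also need a significance test and a correction for base composition, both of which this sketch omits.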
