Identification and functional analysis of human transcriptional promoters.

Genomic and full-length cDNA sequences provide opportunities for understanding human gene structure and transcriptional regulatory elements. The simplest regulatory elements to identify are promoters, as their positions are dictated by the location of transcription start sites. We aligned full-length cDNA clones from the Mammalian Gene Collection to the human genome rough draft sequence to estimate the start sites of more than 10,000 human transcripts. We selected genomic sequence just upstream from the 5' end of these cDNA sequences and designated these as putative promoters. We assayed the functions of 152 of these DNA fragments, chosen at random from the entire set, in a luciferase-based transfection assay in four human cultured cell types. Ninety-one percent of these DNA fragments showed significant transcriptional activity in at least one of the cell lines, whereas 89% showed activity in at least two of the lines. We analyzed the distributions of strengths of these promoter fragments in the different cell types and identified likely alternative promoters in a large fraction of the genes. These data indicate that this approach is an effective method for predicting human promoters and provide the first set of functional data collected in parallel for a large set of human promoters.
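
The step of selecting genomic sequence just upstream of an aligned cDNA 5' end can be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the function names, the 0-based coordinate convention, and the default fragment length of 500 bp are hypothetical, since the abstract does not specify how long the selected fragments were. The point it shows is that "upstream" depends on the strand of the transcript, so minus-strand fragments must be taken from the other side of the start site and reverse-complemented.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def upstream_fragment(chrom_seq: str, tss: int, strand: str, length: int = 500) -> str:
    """Return up to `length` bases immediately upstream of a transcription start site.

    chrom_seq: forward-strand genomic sequence.
    tss: 0-based forward-strand index of the first transcribed base
         (here assumed to be the aligned 5' end of a cDNA).
    strand: "+" or "-" for the strand the transcript lies on.
    length: hypothetical fragment length; not a value taken from the abstract.
    The fragment is returned in the transcript's own orientation.
    """
    if strand == "+":
        start = max(0, tss - length)
        return chrom_seq[start:tss]
    if strand == "-":
        # On the minus strand, upstream lies at higher forward-strand coordinates.
        end = min(len(chrom_seq), tss + 1 + length)
        return reverse_complement(chrom_seq[tss + 1:end])
    raise ValueError("strand must be '+' or '-'")


if __name__ == "__main__":
    genome = "ACGTTGCAAC"  # toy sequence for illustration only
    print(upstream_fragment(genome, tss=4, strand="+", length=3))
    print(upstream_fragment(genome, tss=4, strand="-", length=3))
```

Fragments are truncated at the ends of the sequence rather than padded, so a start site near a contig boundary yields a shorter candidate promoter.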