Differentiation of core promoter architecture between plants and mammals revealed by LDSS analysis

Mammalian promoters are categorized into TATA and CpG-related groups, which play complementary roles associated with distinct transcriptional characteristics. While the TATA box is also found in plant promoters, it is not known whether CpG-type promoters exist in plants. Plant promoters contain Y Patches (pyrimidine patches) in the core promoter region, but it is likewise not known how widely these occur beyond higher plants. Sets of promoter sequences were subjected to analysis of the local distribution of short sequences (LDSS), and approximately one thousand octamer sequences were identified as promoter constituents in each of Arabidopsis, rice, human and mouse. Based on their localization profiles, the identified octamers were classified into several major groups: REG (Regulatory Element Group), TATA box, Inr (Initiator), Kozak, CpG and Y Patch. Comparison of the four species revealed three categories: (i) shared groups found in both plants and mammals (TATA box), (ii) groups common to both kingdoms but with differentiated sequences (REG, Inr and Kozak) and (iii) groups specific to either plants or mammals (CpG and Y Patch). Our comparative LDSS analysis has thus identified both conservation and differentiation of promoter architecture between higher plants and mammals.
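
The idea behind an LDSS-style analysis can be illustrated with a minimal sketch: count where each octamer occurs in promoters aligned on the transcription start site (TSS), then ask whether its occurrences pile up at one position. This is not the published procedure. The function names, the smoothing window and the peak-minus-baseline score are all assumptions made for illustration, and the input is assumed to be equal-length sequences in which the same index corresponds to the same offset from the TSS.

```python
from collections import defaultdict


def positional_profiles(promoters, k=8):
    """Count, for every k-mer, how often it starts at each aligned position.

    `promoters` is assumed to be a list of equal-length DNA strings aligned
    so that a given index is the same distance from the TSS in every sequence.
    """
    length = len(promoters[0])
    n_positions = length - k + 1
    profiles = defaultdict(lambda: [0] * n_positions)
    for seq in promoters:
        if len(seq) != length:
            raise ValueError("sequences must be aligned to equal length")
        seq = seq.upper()
        for i in range(n_positions):
            kmer = seq[i:i + k]
            # Skip windows containing ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                profiles[kmer][i] += 1
    return dict(profiles)


def localization_score(profile, window=3):
    """Return (peak height above baseline, start index of the peak window).

    Hypothetical score for illustration only: counts are summed over a
    sliding window, and the baseline is the mean of all non-peak windows.
    """
    if window > len(profile):
        raise ValueError("window is longer than the profile")
    smoothed = [sum(profile[i:i + window])
                for i in range(len(profile) - window + 1)]
    peak = max(smoothed)
    peak_pos = smoothed.index(peak)
    baseline = (sum(smoothed) - peak) / max(len(smoothed) - 1, 1)
    return peak - baseline, peak_pos
```

An octamer whose score is high relative to other octamers would be a candidate localized promoter constituent. The returned index is an array offset and still has to be converted to a TSS-relative coordinate, and any cut-off for calling a peak would need to be chosen and justified separately.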
