Evolutionary hierarchies of conserved blocks in 5'-noncoding sequences of dicot rbcS genes

Background: Evolutionary processes in gene regulatory regions are major determinants of organismal evolution, but are exceptionally challenging to study. We explored the possibilities of evolutionary analysis of phylogenetic footprints in 5'-noncoding sequences (NCS) from 27 ribulose-1,5-bisphosphate carboxylase small subunit (rbcS) genes from three dicot families (Brassicaceae, Fabaceae and Solanaceae).

Results: Sequences of up to 400 bp encompassing proximal promoter and 5'-untranslated regions were analyzed. We conducted phylogenetic footprinting by several alternative methods: generalized Lempel-Ziv complexity (CLZ), multiple alignments with DIALIGN and ALIGN-M, and the MOTIF SAMPLER Gibbs sampling algorithm. These tools collectively defined 36 conserved blocks of mean length 12.8 bp. On average, 12.5 blocks were found in each 5'-NCS. The blocks occurred in arrays whose relative order was absolutely conserved, confirming the existence of 'conserved modular arrays' in promoters. Identities of half of the blocks confirmed past rbcS research, including versions of the I-box, G-box, and GT-1 sites such as Box II. Over 90% of blocks overlapped DNase-protected regions in tomato 5'-NCS. Regions characterized by low CLZ in sliding-window analyses were also frequently associated with DNase protection. Blocks could be assigned to evolutionary hierarchies based on taxonomic distribution and estimated age. Lineage divergence dates implied that 13 blocks found in all three plant families were of Cretaceous antiquity, while other family-specific blocks were much younger. Blocks were also dated by formation of multigene families, using genome and coding sequence information. Dendrograms of evolutionary relations of the 5'-NCS were produced by several methods, including cluster analysis using pairwise CLZ values, evolutionary trees of DIALIGN sequence alignments, and cladistic analysis of conserved blocks.

Conclusion: Dicot 5'-NCS contain conserved modular arrays of recurrent sequence blocks, which are coincident with functional elements. These blocks are amenable to evolutionary interpretation as hierarchies in which ancient, taxonomically widespread blocks can be distinguished from more recent, taxon-specific ones.
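
The sliding-window complexity analysis described in the abstract can be illustrated with a minimal Python sketch. It assumes the classic Lempel-Ziv (1976) exhaustive-history parsing, not the generalized CLZ measure used in the study, and the function names, window size and step are hypothetical choices made only for illustration.

```python
def lz76_complexity(seq: str) -> int:
    """Count components in the Lempel-Ziv (1976) exhaustive parsing of seq.

    Assumption: this is the classic measure, not the generalized CLZ
    variant named in the abstract.
    """
    n = len(seq)
    start = 0
    components = 0
    while start < n:
        length = 1
        # Grow the candidate while it can still be copied from the text
        # that precedes its last symbol (the copy may overlap itself).
        while (start + length <= n
               and seq[start:start + length] in seq[:start + length - 1]):
            length += 1
        components += 1
        start += length
    return components


def sliding_complexity(seq: str, window: int = 40, step: int = 10):
    """Return (window_start, complexity) pairs along seq.

    window and step are illustrative defaults, not values from the study.
    """
    return [
        (i, lz76_complexity(seq[i:i + window]))
        for i in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Hypothetical sequence: a simple repeat followed by a more varied stretch.
    example = "AT" * 20 + "ACGTTGCAAGCTTAGGCTAACGGATCCATGACTGATCGTA"
    for start, c in sliding_complexity(example, window=40, step=20):
        print(start, c)
```

Windows that parse into few components (such as the repeat at the start of the hypothetical sequence) are the low-complexity regions; in the study, regions of low CLZ were the ones frequently associated with DNase protection.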
