Revealing gene transcription and translation initiation patterns in archaea, using an interactive clustering model

We describe an interactive clustering model based on positional weight matrices and present results obtained by using it to analyze gene regulation patterns in archaea. The 5′ flanking sequences of ORFs identified in four archaea, Sulfolobus solfataricus, Pyrobaculum aerophilum, Halobacterium sp. NRC-1, and Pyrococcus abyssi, were clustered using the model. For most ORFs, the clusters fell into three regular patterns. The first comprised genes with only a ribosome-binding site; the second comprised genes with a transcriptional regulatory region at a constant position relative to the start codon; the third combined the two. Both P. aerophilum and Halobacterium sp. NRC-1 also exhibited clusters of genes that lacked any regular pattern, and Halobacterium sp. NRC-1 presented regular features not seen in the other organisms. This group of archaea seems to use a combination of eubacterial and eukaryotic regulatory features, together with some features unique to individual species. Our results suggest that interactive clustering may be used to examine the divergence of the gene regulatory machinery in archaea and to identify archaea-specific gene regulation patterns.
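
To make the idea of clustering with positional weight matrices concrete, the following is a minimal Python sketch, not the model used in this work. It builds one log-odds weight matrix per cluster from equal-length upstream sequences and assigns each sequence to the cluster whose matrix scores it highest. The function names (`build_pwm`, `score`, `assign`), the pseudocount, the uniform background frequency of 0.25, and the toy sequences are all assumptions made for illustration.

```python
import math
from collections import Counter

BASES = "ACGT"

def build_pwm(seqs, pseudocount=1.0, background=0.25):
    """Build a log-odds positional weight matrix from equal-length sequences.

    Assumes a uniform background and an additive pseudocount; both are
    illustrative choices, not values taken from the paper.
    """
    length = len(seqs[0])
    total = len(seqs) + pseudocount * len(BASES)
    pwm = []
    for pos in range(length):
        counts = Counter(s[pos] for s in seqs)
        pwm.append({
            b: math.log2(((counts[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def score(pwm, seq):
    """Sum the per-position log-odds scores of seq under pwm."""
    return sum(column[base] for column, base in zip(pwm, seq))

def assign(seqs, pwms):
    """Assign each sequence to the index of its best-scoring matrix."""
    return [max(range(len(pwms)), key=lambda k: score(pwms[k], s)) for s in seqs]

# Hypothetical toy clusters: an A/T-rich, promoter-like group and a
# purine-rich, ribosome-binding-site-like group.
tata_like = ["TTTATA", "TTTAAA", "CTTATA"]
sd_like = ["GGAGGT", "GGTGAT", "AGGAGG"]
pwms = [build_pwm(tata_like), build_pwm(sd_like)]

labels = assign(["TTTATA", "GGAGGT"], pwms)
print(labels)
```

In an interactive setting, a user would inspect the resulting clusters, split or merge them, rebuild the matrices, and reassign the sequences; the sketch shows only a single build-and-assign step.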
