Search for evolution-related-oligonucleotides and conservative words in rRNA sequences

We describe a method for finding unmapped conserved words in rRNA sequences that is effective, uses evolutionary information, and does not depend on multiple sequence alignment. The evolutionary distance (called the n-distance) between a pair of 16S or 18S rRNA sequences is defined in terms of the difference between the two sets of frequencies of occurrence of oligonucleotides n bases long (n-mers) in the two sequences. These n-distances are used to reconstruct phylogenetic trees for 35 representative organisms from all three kingdoms. The quality of the tree generally improves with increasing n and reaches a plateau of best fit at n = 7 or 8. Hence the 7-mer or 8-mer frequencies provide a basis for describing rRNA evolution. By analyzing the contribution of individual 7-mers to the 7-distances, we identify a set of 612 7-mers (called evolution-related oligonucleotides, EROs) that are critical to the topology of the best phylogenetic tree. Expanding from this set of EROs, we identify evolution-related conservative words longer than 7 bases in 16S rRNA sequences from an enlarged set of 98 bacterial and archaeal organisms, based on two criteria: 1) the word is highly conserved in nearly all species of a kingdom (or a sub-kingdom); and 2) the word is located at nearly the same site in each sequence. Three examples of words found in this way are as follows. The 13-mer ggattagataccc, located at the end of a loop near H24 (in E. coli), is conserved in almost all species of archaea and bacteria. The 8-mer aacgagcg, located on H35, is also conserved in archaea and bacteria. Its expansion, the 32-mer tgttgggttaagtcccgcaacgagcgcaaccc, is conserved in bacteria but not in archaea.
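The alignment-free idea behind the n-distance and the two conservation criteria can be illustrated with a minimal Python sketch. The abstract does not give the exact distance formula, so the sum of absolute differences of relative n-mer frequencies used here is an assumption, not the authors' definition. The function names, the thresholds `min_fraction` and `max_spread`, and the use of relative position as a stand-in for "nearly the same site" are all hypothetical choices made for illustration.

```python
from collections import Counter


def nmer_frequencies(seq: str, n: int) -> dict[str, float]:
    """Relative frequency of every overlapping n-mer in seq."""
    seq = seq.lower()
    total = len(seq) - n + 1
    if total <= 0:
        return {}
    counts = Counter(seq[i:i + n] for i in range(total))
    return {word: c / total for word, c in counts.items()}


def n_distance(seq_a: str, seq_b: str, n: int) -> float:
    """Assumed form of the n-distance: sum of absolute differences
    between the n-mer frequencies of the two sequences."""
    fa = nmer_frequencies(seq_a, n)
    fb = nmer_frequencies(seq_b, n)
    return sum(abs(fa.get(w, 0.0) - fb.get(w, 0.0)) for w in fa.keys() | fb.keys())


def is_conserved_word(word: str, seqs: list[str],
                      min_fraction: float = 0.9,
                      max_spread: float = 0.05) -> bool:
    """Check the two criteria with hypothetical thresholds:
    1) the word occurs in at least min_fraction of the sequences;
    2) the relative positions of its first occurrence differ by at
       most max_spread across the sequences that contain it."""
    word = word.lower()
    positions = [s.lower().find(word) / len(s) for s in seqs if word in s.lower()]
    if not positions or len(positions) < min_fraction * len(seqs):
        return False
    return max(positions) - min(positions) <= max_spread
```

In this sketch, a matrix of pairwise `n_distance` values for n = 7 or 8 would be the input to a standard distance-based tree-building method, and `is_conserved_word` would be applied to candidate words grown outward from the EROs. Neither step is claimed to reproduce the published procedure.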
