Model for the distributions of k-mers in DNA sequences.
Evolutionary features of various organisms are studied through the distributions of k-mers in their DNA sequences. The organisms are classified into three groups according to their evolutionary periods: (a) E. coli and T. pallidum; (b) yeast, zebrafish, A. thaliana, and fruit fly; (c) mouse, chicken, and human. The distributions of 6-mers for these three groups are shown to be, respectively, (a) unimodal, (b) unimodal with peaks generally shifted to smaller frequencies of occurrence, and (c) bimodal. To describe the bimodal feature of the k-mer distributions of group (c), a model based on the cytosine-guanine ("CG") content of the DNA sequences is introduced and shown to provide reasonably good agreement.
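As an illustration of the quantity discussed above, the following is a minimal sketch of how a k-mer distribution might be tabulated: count every overlapping k-mer in a sequence, then count how many distinct k-mers share each frequency of occurrence. The function names `count_kmers` and `occurrence_distribution`, the use of overlapping windows, and the skipping of windows with non-ACGT symbols are assumptions made for this example, not details taken from the paper.

```python
from collections import Counter


def count_kmers(seq, k=6):
    """Count overlapping k-mers, skipping windows with non-ACGT symbols."""
    seq = seq.upper()
    alphabet = set("ACGT")
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= alphabet:
            counts[kmer] += 1
    return counts


def occurrence_distribution(counts, k=6):
    """Map each frequency of occurrence to the number of distinct k-mers
    observed that many times. K-mers that never occur are reported
    under frequency 0."""
    dist = Counter(counts.values())
    missing = 4 ** k - len(counts)  # possible k-mers that were never seen
    if missing:
        dist[0] = missing
    return dict(sorted(dist.items()))


if __name__ == "__main__":
    toy = "ACGTACGT"  # toy input, not real genomic data
    print(occurrence_distribution(count_kmers(toy, k=2), k=2))
```

Applied to a whole genome with k = 6, plotting the resulting mapping (frequency of occurrence on the horizontal axis, number of 6-mers on the vertical axis) gives a histogram of the kind whose unimodal or bimodal shape is described above.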