Paper information - A Probabilistic Approach To Identifying Consensus In Molecular Sequences

Given a profile of nucleic acid bases at a specified position in an aligned set of molecular sequences, a simple rule for defining ambiguity codes is presented: all bases whose frequency in the profile falls below the maximum profile frequency by no more than a specified number d are included in the ambiguity code. Ways are described of defining d so as to ensure that this ‘containing subset’ possesses desirable properties under the assumption of a multinomial model for the frequencies of bases in the profile. The method is illustrated on two data sets, and a discussion is given of its characteristics in terms of some possible properties for consensus methods presented by Day and McMorris (1992a).
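The rule stated in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes that "frequency" means the raw count of each base in an alignment column and that d is a count difference supplied by the caller. The function names `containing_subset` and `ambiguity_code` are hypothetical, and the multinomial-based choice of d described in the abstract is not reproduced here. The lookup table uses the standard IUPAC nucleotide ambiguity codes.

```python
from collections import Counter

BASES = "ACGT"

# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases covered.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def containing_subset(column, d):
    """Return the bases whose count is within d of the maximum count.

    `column` is one aligned position, e.g. "AAAAGGGC" (hypothetical input
    format). Characters other than A, C, G, T are ignored. `d` is treated
    as a difference in counts, which is an assumption of this sketch.
    """
    counts = Counter(ch for ch in column.upper() if ch in BASES)
    top = max(counts[b] for b in BASES)
    # A base with count 0 is still included when top - 0 <= d.
    return frozenset(b for b in BASES if top - counts[b] <= d)


def ambiguity_code(column, d):
    """Map the containing subset of a column to its IUPAC ambiguity code."""
    return IUPAC[containing_subset(column, d)]
```

For the column "AAAAGGGC" the counts are A = 4, G = 3, C = 1, T = 0, so `ambiguity_code("AAAAGGGC", 0)` gives "A" and `ambiguity_code("AAAAGGGC", 1)` gives "R" (A or G). Larger values of d widen the subset until every base is included and the code becomes "N".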

A. D. Gordon

[1] Fred R. McMorris, et al. A consensus program for molecular sequences, 1993, Comput. Appl. Biosci.

[2] W. H. Day, et al. Critical comparison of consensus methods for molecular sequences, 1992, Nucleic acids research.

[3] T. D. Schneider, et al. Sequence logos: a new way to display consensus sequences, 1990, Nucleic acids research.

[4] B. Vissel, et al. A survey of the genomic distribution of alpha satellite DNA on all the human chromosomes, and derivation of a new consensus sequence, 1991.

[5] M. Linsenmeyer, et al. Revised genomic consensus for the hypermethylated CpG island region of the human L1 transposon and integration sites of full length L1 elements from recombinant clones made using methylation-tolerant host strains, 1991, Nucleic acids research.

[6] Han Liu, et al. On selecting a subset containing the most probable multinomial event, 1991.

[7] W. H. Day, et al. Consensus sequences based on plurality rule, 1992, Bulletin of mathematical biology.

[8] William H. E. Day, et al. An Empirical Evaluation of Consensus Rules for Molecular Sequences, 1994.