Paper Information - Searching for Multiple Words in a Markov Sequence

Searching for Multiple Words in a Markov Sequence

The theory of the discrete-time Markovian arrival process (DMAP) can be applied to some statistical problems encountered when searching for multiple words in a Markov sequence. Such word searches are often emphasized in studies of the human genome. There are several advantages to the DMAP approach we present. Most notably, its derivations are transparent, and they readily unify disparate results about the exact distributions of overlapping and nonoverlapping word counts. We also present several examples and applications of our theory, including a numerical study using a random DNA dataset from the human genome.

John L. Spouge | Yonil Park

[1] Richard Arratia, et al. Central Limit Theorem from Renewal Theory for Several Patterns, 1997, J. Comput. Biol.

[2] Stéphane Robin, et al. Numerical Comparison of Several Approximations of the Word Count Distribution in Random Sequences, 2002, J. Comput. Biol.

[3] J. D. Biggins, et al. Markov renewal processes, counters and repeated sequences in Markov chains, 1987, Advances in Applied Probability.

[4] Michael S. Waterman, et al. Introduction to computational biology, 1995.

[5] M. H. Skolnick, et al. A model for restriction fragment length distributions, 1983, American Journal of Human Genetics.

[6] W. Y. Wendy Lou, et al. Distribution Theory of Runs and Patterns and Its Applications: A Finite Markov Chain Imbedding Approach, 2003.

[7] Jean-Jacques Daudin, et al. Exact distribution of word occurrences in a random sequence of letters, 1999, Journal of Applied Probability.

[8] Gesine Reinert, et al. Compound Poisson and Poisson Process Approximations for Occurrences of Multiple Words in Markov Chains, 1998, J. Comput. Biol.

[9] Michael S. Waterman, et al. Renewal theory for several patterns, 1985.

[10] Mikhail S. Gelfand, et al. Extendable words in nucleotide sequences, 1992, Comput. Appl. Biosci.

[11] Markos V. Koutras, et al. Distribution Theory of Runs: A Markov Chain Approach, 1994.