Local Limit Properties for Pattern Statistics and Rational Models

Abstract

Motivated by problems of pattern statistics, we study the limit distribution of the random variable counting the number of occurrences of the symbol a in a word of length n chosen at random in {a,b}*, according to a probability distribution defined via a rational formal series s with positive real coefficients. Our main result is a local limit theorem of Gaussian type for these statistics under the hypothesis that s is a power of a primitive series. This result is obtained by showing a general criterion for (Gaussian) local limit laws of sequences of integer random variables. To prove our result we also introduce and analyse a notion of symbol-periodicity for irreducible matrices, whose entries are polynomials over positive semirings; the properties we prove on this topic extend the classical Perron--Frobenius theory of non-negative real matrices. As a further application we obtain some asymptotic evaluations of the maximum coefficient of monomials of given size for rational series in two commutative variables.
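To make the random variable in the abstract concrete, the sketch below computes its exact distribution for small n. It assumes the usual reading of such a model: the series s is given by a linear representation, so the coefficient of a word w is xi · M(w) · eta with M(a) = A and M(b) = B, and a word of length n is drawn with probability proportional to that coefficient. The names `xi`, `A`, `B`, `eta` and the function `symbol_count_distribution` are illustrative assumptions and do not come from the paper.

```python
from fractions import Fraction


def symbol_count_distribution(xi, A, B, eta, n):
    """Return [P(Y_n = k) for k = 0..n], where Y_n counts occurrences of 'a'.

    xi, eta: initial and final weight vectors (length m).
    A, B:    m x m non-negative weight matrices for the symbols 'a' and 'b'.
    Assumed model: P(w) is proportional to xi * M(w) * eta for |w| = n.
    """
    m = len(xi)
    # state[k][j] = total weight of prefixes read so far that contain
    # exactly k occurrences of 'a' and end in state j.
    state = [list(xi)]
    for _ in range(n):
        nxt = [[0] * m for _ in range(len(state) + 1)]
        for k, row in enumerate(state):
            for i, w in enumerate(row):
                if w == 0:
                    continue
                for j in range(m):
                    nxt[k + 1][j] += w * A[i][j]  # read an 'a'
                    nxt[k][j] += w * B[i][j]      # read a 'b'
        state = nxt
    # Close each row with the final vector to get the weight of Y_n = k.
    weights = [sum(r * e for r, e in zip(row, eta)) for row in state]
    total = sum(weights)
    if total == 0:
        raise ValueError("the series has no word of length n with non-zero weight")
    return [Fraction(w) / Fraction(total) for w in weights]


# One state with A = B = [1]: every word has weight 1, so Y_n is Binomial(n, 1/2).
uniform = symbol_count_distribution([1], [[1]], [[1]], [1], 4)

# Two states recognising words with no factor "aa" (state 1 = last symbol was 'a').
no_aa = symbol_count_distribution(
    [1, 0], [[0, 1], [0, 0]], [[1, 0], [1, 0]], [1, 1], 3
)
```

The dynamic programme tracks the coefficients of xi · (A x + B)^n as polynomials in x, so it costs O(n^2 m^2) arithmetic operations. It is meant only for checking small cases by hand, not as a route to the asymptotic results.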
