Non-redundant random generation algorithms for weighted context-free languages

We address the non-redundant random generation of $k$ words of length $n$ in a context-free language, with the additional requirement that a predefined set of words be avoided. We first study a rejection-based approach, whose worst-case time complexity is shown to grow exponentially with $k$ for some specifications and in the limit case of a coupon collector. We then propose two algorithms, based respectively on the recursive method and on an unranking approach. We show how careful implementations of these algorithms allow for a non-redundant generation of $k$ words of length $n$ in $\mathcal{O}(k\cdot n\cdot \log{n})$ arithmetic operations, after a precomputation of $\Theta(n)$ numbers. The overall complexity is therefore dominated by the generation of the $k$ words, and non-redundancy comes at a negligible cost.
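
To make the rejection-based baseline concrete, the following is a minimal sketch, not the implementation studied in the paper. It assumes a black-box sampler `draw` (hypothetical; here the uniform distribution over $\{a,b\}^n$ stands in for a context-free language) and simply redraws whenever a word is forbidden or already generated. The helper `expected_draws_uniform` gives the textbook coupon-collector expectation in the uniform case only; it does not model the weighted specifications for which the abstract reports exponential growth in $k$.

```python
import random
from fractions import Fraction


def rejection_sample_distinct(draw, k, forbidden=frozenset()):
    """Call draw() until k distinct words outside `forbidden` are collected.

    Returns (words, draws), where draws counts every call to draw(),
    rejected ones included. Loops forever if fewer than k admissible
    words can ever be drawn.
    """
    seen = set(forbidden)  # forbidden words are treated as already seen
    words, draws = [], 0
    while len(words) < k:
        w = draw()
        draws += 1
        if w not in seen:  # reject duplicates and forbidden words
            seen.add(w)
            words.append(w)
    return words, draws


def expected_draws_uniform(total, k, num_forbidden=0):
    """Expected number of draws when all `total` words are equiprobable.

    With i words accepted so far, a draw succeeds with probability
    (total - num_forbidden - i) / total.
    """
    if k > total - num_forbidden:
        raise ValueError("not enough admissible words")
    return sum(Fraction(total, total - num_forbidden - i) for i in range(k))


def make_uniform_binary_sampler(n, rng):
    """Hypothetical stand-in sampler: uniform words of length n over {a, b}."""
    return lambda: "".join(rng.choice("ab") for _ in range(n))


if __name__ == "__main__":
    rng = random.Random(0)
    draw = make_uniform_binary_sampler(3, rng)
    words, draws = rejection_sample_distinct(draw, k=5, forbidden={"aaa"})
    print(words, draws)
    # Collecting all 4 words out of 4 costs 25/3 draws on average.
    print(expected_draws_uniform(4, 4))
```

In the uniform case the expected cost stays close to $k$ while $k$ is small relative to the number of admissible words, and approaches the coupon-collector regime as $k$ nears that number.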
