GENERALIZED BIRTHDAY PROBLEMS IN THE LARGE-DEVIATIONS REGIME

This paper considers generalized birthday problems, in which there are d classes of possible outcomes. A fraction $f_i$ of the N possible outcomes belongs to class i, and each outcome in that class has probability $\alpha_i/N$, where $\sum_{i=1}^{d} f_{i} =\sum_{i=1}^{d} f_{i}\alpha_{i}=1$. Sampling k times (with replacement), the objective is to determine (or approximate) the probability that all outcomes are different, the so-called uniqueness probability (or no-coincidence probability). Although it is trivial to characterize this probability explicitly for the case d=1, the situation with multiple classes is substantially harder to analyze. Parameterizing k ≡ aN, it turns out that the uniqueness probability decays essentially exponentially in N, where the associated decay rate ζ follows from a variational problem. This problem can be solved in closed form only for small d. Assuming $\alpha_i$ is of the form $1+\varphi_i\varepsilon$, the decay rate ζ can be written as a power series in ε; we demonstrate how to compute the corresponding coefficients explicitly. In addition, a logarithmically efficient simulation procedure is proposed. The paper concludes with a series of numerical experiments, showing that (i) the proposed simulation approach is fast and accurate, (ii) assuming all outcomes to be equally likely would lead to estimates of the uniqueness probability that can be orders of magnitude off, and (iii) the power-series-based approximations work remarkably well.
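To make the setup concrete, the following is a minimal Python sketch of the uniqueness probability for small instances. It is not the paper's method (neither the variational characterization of ζ nor the proposed simulation procedure): it relies only on the standard identity that the probability of k distinct draws equals k! times the k-th elementary symmetric polynomial of the outcome probabilities. The function names and the parameter `counts` (the number of outcomes per class, i.e. $f_i N$) are assumptions introduced for illustration. For d=1 it also compares the empirical decay rate with the limit $(1-a)\ln(1-a)+a$, which follows from a routine Riemann-sum argument and is stated here as a sanity check rather than quoted from the paper.

```python
from math import comb, factorial, log, log1p, prod


def uniqueness_uniform(N, k):
    """d = 1: probability that k draws from N equally likely outcomes are all distinct."""
    return prod(1 - j / N for j in range(k))


def log_uniqueness_uniform(N, k):
    """Logarithm of the d = 1 uniqueness probability (avoids underflow for large N)."""
    return sum(log1p(-j / N) for j in range(k))


def uniqueness_classes(counts, alphas, k):
    """Exact uniqueness probability for d classes (illustrative helper, not from the paper).

    counts[i] is the number of outcomes in class i (f_i * N); each such outcome
    has probability alphas[i] / N. The result is k! times the coefficient of x^k
    in prod_i (1 + (alphas[i] / N) * x) ** counts[i].
    """
    N = sum(counts)
    coeffs = [1.0] + [0.0] * k  # coeffs[m]: coefficient of x^m in the running product
    for n_i, a_i in zip(counts, alphas):
        p = a_i / N
        # Binomial expansion of (1 + p x)^n_i, truncated at degree k.
        cls = [comb(n_i, m) * p**m for m in range(k + 1)]
        # Polynomial multiplication, truncated at degree k.
        coeffs = [
            sum(coeffs[j] * cls[m - j] for j in range(m + 1))
            for m in range(k + 1)
        ]
    return factorial(k) * coeffs[k]


if __name__ == "__main__":
    # Classical birthday problem: one class of 365 equally likely outcomes.
    print(uniqueness_uniform(365, 23))
    # Two classes of 2 outcomes each, with probabilities 1.5/4 and 0.5/4.
    print(uniqueness_classes([2, 2], [1.5, 0.5], 2))
    # Empirical decay rate for d = 1 with a = 0.5, against (1-a)ln(1-a) + a.
    N, a = 10_000, 0.5
    print(-log_uniqueness_uniform(N, int(a * N)) / N, (1 - a) * log(1 - a) + a)
```

The polynomial-product approach costs on the order of d·k² operations and works in floating point, so it is only suitable for modest N and k; for the large-deviations regime k ≡ aN with large N, the probabilities underflow and one would work on the log scale or use the approximations the paper develops.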
