On the Analysis of Linear Probing Hashing

Abstract. This paper presents moment analyses and characterizations of limit distributions for the construction cost of hash tables under the linear probing strategy. Two models are considered: full tables, and sparse tables with a fixed filling ratio strictly smaller than one. For full tables, the construction cost has expectation $O(n^{3/2})$, the standard deviation is of the same order, and a limit law of the Airy type holds. (The Airy distribution is a semiclassical distribution that is defined in terms of the usual Airy functions or, equivalently, in terms of Bessel functions of indices $-\frac{1}{3}, \frac{2}{3}$.) For sparse tables, the construction cost has expectation $O(n)$, standard deviation $O(\sqrt{n})$, and a limit law of the Gaussian type. Combinatorial relations with other problems leading to Airy phenomena (such as graph connectivity, tree inversions, tree path length, or area under excursions) are also briefly discussed.
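As an informal illustration of the two regimes, the following Python sketch estimates the construction cost by simulation. It assumes that "construction cost" means the total displacement, that is, the sum over all inserted keys of the number of cells each key is moved past its hashed position, and that hash values are independent and uniform over the table. The function names `total_displacement` and `mean_cost`, the table sizes, and the trial counts are illustrative choices, not taken from the paper.

```python
import random


def total_displacement(m, n, rng):
    """Insert n keys into a circular table of m cells using linear probing.

    Returns the total displacement: the number of occupied cells skipped,
    summed over all insertions. Hash values are drawn uniformly at random
    (an assumption of this sketch).
    """
    if n > m:
        raise ValueError("cannot insert more keys than cells")
    occupied = [False] * m
    cost = 0
    for _ in range(n):
        pos = rng.randrange(m)      # uniform hash address
        while occupied[pos]:        # probe the next cell, wrapping around
            pos = (pos + 1) % m
            cost += 1
        occupied[pos] = True
    return cost


def mean_cost(m, n, trials, seed=0):
    """Average total displacement over independent simulated tables."""
    rng = random.Random(seed)
    return sum(total_displacement(m, n, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    for n in (100, 400, 1600):
        full = mean_cost(n, n, trials=50)        # full table: n keys, n cells
        sparse = mean_cost(2 * n, n, trials=50)  # filling ratio 1/2
        # Normalized columns should stay roughly stable if the stated
        # growth orders n^(3/2) and n hold.
        print(n, full / n ** 1.5, sparse / n)
```

If the orders of growth stated in the abstract hold, the two normalized columns should remain roughly constant as n grows. The simulation only probes expectations; it says nothing about the limit laws.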
