Balanced Families of Perfect Hash Functions and Their Applications

The construction of perfect hash functions is a well-studied topic. In this paper, we generalize this concept with the following definition. A family of functions from [n] to [k] is a δ-balanced (n, k)-family of perfect hash functions if, for every S ⊆ [n] with |S| = k, the number of functions that are 1-1 on S is between T/δ and δT for some constant T > 0. The standard definition of a family of perfect hash functions requires only that at least one function be 1-1 on S for each S of size k. In the new notion of balanced families, we require the number of 1-1 functions to be almost the same (taking δ close to 1) for every such S. Our main result is that for any constant δ > 1, a δ-balanced (n, k)-family of perfect hash functions of size 2^{O(k log log k)} log n can be constructed in time 2^{O(k log log k)} n log n. Using the technique of color-coding, we can apply our explicit constructions to devise approximation algorithms for various counting problems in graphs. In particular, we exhibit a deterministic polynomial-time algorithm for approximating both the number of simple paths of length k and the number of simple cycles of size k for any k ≤ O(log n / log log log n) in a graph with n vertices. The approximation is up to any fixed desired relative error.
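To make the definition concrete, the following minimal sketch checks by brute force how balanced a small family is. It is only an illustration of the definition, not the paper's construction. The function names (`injective_counts`, `best_delta`) and the representation of a function as a tuple indexed by 0, ..., n-1 are assumptions made for this example.

```python
from itertools import combinations, product
from math import sqrt


def injective_counts(family, n, k):
    """For every k-subset S of {0, ..., n-1}, count the functions in
    `family` that are 1-1 on S. Each function is a tuple f with f[x]
    being the image of x (illustrative representation)."""
    counts = {}
    for S in combinations(range(n), k):
        counts[S] = sum(1 for f in family if len({f[x] for x in S}) == k)
    return counts


def best_delta(family, n, k):
    """Smallest delta for which the family is delta-balanced, or None if
    some k-subset has no 1-1 function (not a perfect hash family at all).

    If the counts range over [lo, hi], choosing T = sqrt(lo * hi) gives
    T / delta = lo and delta * T = hi with delta = sqrt(hi / lo)."""
    counts = injective_counts(family, n, k).values()
    lo, hi = min(counts), max(counts)
    if lo == 0:
        return None
    return sqrt(hi / lo)


# The family of all functions from a 4-element set to a 2-element set is
# perfectly balanced: every pair is separated by the same number of functions.
all_functions = list(product(range(2), repeat=4))
print(best_delta(all_functions, 4, 2))

# A single function cannot separate every pair, so it is not a perfect
# hash family and the check reports None.
print(best_delta([(0, 1, 0, 1)], 4, 2))
```

The brute-force check enumerates all k-subsets, so it is only practical for very small n and k; the point of the paper's result is that balanced families of small size can be constructed explicitly without such enumeration.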
