Balanced families of perfect hash functions and their applications

The construction of perfect hash functions is a well-studied topic. This article generalizes the concept with the following definition. We say that a family of functions from [<i>n</i>] to [<i>k</i>] is a Δ-balanced (<i>n,k</i>)-family of perfect hash functions if for every <i>S</i> ⊆ [<i>n</i>] with |<i>S</i>| = <i>k</i>, the number of functions that are 1-1 on <i>S</i> is between <i>T</i>/Δ and Δ<i>T</i> for some constant <i>T</i> > 0. The standard definition of a family of perfect hash functions requires only that at least one function be 1-1 on <i>S</i>, for each <i>S</i> of size <i>k</i>. In the new notion of balanced families, we require the number of 1-1 functions to be almost the same (taking Δ to be close to 1) for every such <i>S</i>. Our main result is that for any constant Δ > 1, a Δ-balanced (<i>n,k</i>)-family of perfect hash functions of size 2<sup><i>O</i>(<i>k</i> log log <i>k</i>)</sup> log <i>n</i> can be constructed in time 2<sup><i>O</i>(<i>k</i> log log <i>k</i>)</sup> <i>n</i> log <i>n</i>. Using the technique of color-coding, we can apply our explicit constructions to devise approximation algorithms for various counting problems in graphs. In particular, we exhibit a deterministic polynomial-time algorithm for approximating both the number of simple paths of length <i>k</i> and the number of simple cycles of size <i>k</i> for any <i>k</i> ≤ <i>O</i>(log <i>n</i>/log log log <i>n</i>) in a graph with <i>n</i> vertices. The approximation is up to any fixed desired relative error.
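To make the definition concrete, the following is a minimal brute-force sketch that checks whether a small family is Δ-balanced. It is not the construction from the article, only a checker for toy instances. The names `injective_counts` and `is_balanced` are hypothetical, functions are represented as tuples of length <i>n</i> with values in {0, ..., <i>k</i>−1}, and elements of [<i>n</i>] are indexed from 0. The check relies on the observation that a suitable <i>T</i> exists exactly when the smallest count is positive and the largest count is at most Δ² times the smallest.

```python
from itertools import combinations

def injective_counts(family, n, k):
    """For each k-subset S of range(n), count the functions in `family` that are 1-1 on S.

    Each function is a tuple f of length n, where f[x] is the color (in range(k)) of x.
    """
    counts = {}
    for S in combinations(range(n), k):
        counts[S] = sum(1 for f in family if len({f[x] for x in S}) == k)
    return counts

def is_balanced(family, n, k, delta):
    """Return True if some T > 0 satisfies T/delta <= count(S) <= delta*T for every S.

    Such a T exists iff min count > 0 and max count <= delta**2 * min count
    (any T between max/delta and delta*min works).
    """
    values = injective_counts(family, n, k).values()
    lo, hi = min(values), max(values)
    return lo > 0 and hi <= delta * delta * lo

# Toy instance with n = 4, k = 2 (assumed example data, not from the article).
# Every pair of elements is separated by exactly two of these three functions.
even_family = [(0, 1, 0, 1), (0, 1, 1, 0), (0, 0, 1, 1)]

# Repeating one function keeps the family perfect but skews the counts to 2 or 3.
skewed_family = even_family + [(0, 1, 0, 1)]

print(is_balanced(even_family, 4, 2, 1.0))    # counts are all equal
print(is_balanced(skewed_family, 4, 2, 1.1))  # ratio 3/2 exceeds 1.1**2
print(is_balanced(skewed_family, 4, 2, 1.3))  # ratio 3/2 is within 1.3**2
```

The checker enumerates all <i>k</i>-subsets, so it is only practical for very small <i>n</i> and <i>k</i>; its purpose is to illustrate the difference between a family that is merely perfect and one that is balanced.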
