Hyperedge Estimation using Polylogarithmic Subset Queries

A hypergraph ${\cal H}$ is a \emph{set system} $(U({\cal H}),{\cal F}({\cal H}))$, where $U({\cal H})$ denotes the set of $n$ vertices and ${\cal F}({\cal H})$, a set of subsets of $U({\cal H})$, denotes the set of hyperedges. A hypergraph ${\cal H}$ is said to be $d$-uniform if every hyperedge in ${\cal H}$ consists of exactly $d$ vertices. The cardinality of the hyperedge set is denoted by $m({\cal H}) = |{\cal F}({\cal H})|$. We consider oracle access to the hypergraph ${\cal H}$ of the following form. Given $d$ non-empty, pairwise disjoint subsets of vertices $A_1,\ldots,A_d \subseteq U({\cal H})$, the oracle, known as the {\bf Generalized $d$-partite independent set oracle ({\sc GPIS})} and introduced in \cite{BishnuGKM018}, answers {\sc Yes} if and only if there exists a hyperedge in ${\cal H}$ having exactly one vertex in each $A_i$, $i \in [d]$. The {\sc GPIS} oracle belongs to the class of oracles for subset queries. The study of subset queries was initiated by Stockmeyer \cite{Stockmeyer85}, and the model was later formalized by Ron and Tsur \cite{RonT16}. Subset queries generalize set membership queries. In this work we give an algorithm for the {\sc Hyperedge Estimation} problem that uses the {\sc GPIS} query oracle to obtain an estimate $\widehat{m}$ for $m({\cal H})$ satisfying $(1 - \epsilon) \cdot m({\cal H}) \leq \widehat{m} \leq (1 + \epsilon) \cdot m({\cal H})$. The number of queries made by our algorithm, treating $d$ as a constant, is polylogarithmic in the number of vertices of the hypergraph. Our work can be seen as a natural generalization of {\sc Edge Estimation} using the {\sc Bipartite Independent Set} ({\sc BIS}) oracle \cite{BeameHRRS18} and {\sc Triangle Estimation} using the {\sc Tripartite Independent Set} ({\sc TIS}) oracle \cite{Bhatta-abs-1808-00691}.
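To make the oracle's semantics concrete, the following is a minimal sketch of how a GPIS query could be simulated when the hypergraph is given explicitly as a list of hyperedges. It illustrates only the definition above, not the estimation algorithm itself. The function name `gpis_query` and the list-of-frozensets representation are assumptions made for this illustration; in the query model the algorithm never sees the hyperedges and interacts with ${\cal H}$ only through the oracle's answers.

```python
def gpis_query(hyperedges, parts):
    """Simulate one GPIS query on an explicitly stored d-uniform hypergraph.

    hyperedges: iterable of frozensets, each of size d (hypothetical representation).
    parts:      list of d non-empty, pairwise disjoint vertex sets A_1, ..., A_d.
    Returns True (the oracle's "Yes") iff some hyperedge has exactly one
    vertex in each A_i.
    """
    sets = [set(p) for p in parts]
    d = len(sets)

    # The oracle is only defined for non-empty, pairwise disjoint subsets.
    if any(len(s) == 0 for s in sets):
        raise ValueError("every A_i must be non-empty")
    if sum(len(s) for s in sets) != len(set().union(*sets)):
        raise ValueError("the A_i must be pairwise disjoint")

    for edge in hyperedges:
        if len(edge) != d:
            continue  # not a hyperedge of a d-uniform hypergraph
        # The parts are disjoint and |edge| = d, so one vertex per part
        # accounts for every vertex of the edge.
        if all(len(s & edge) == 1 for s in sets):
            return True
    return False


# A small 3-uniform hypergraph on vertices 1..6 (illustrative data only).
H = [frozenset({1, 2, 3}), frozenset({3, 4, 5})]
answer_yes = gpis_query(H, [{1}, {2}, {3, 4}])
answer_no = gpis_query(H, [{1, 2}, {3}, {6}])
```

In the first query the hyperedge $\{1,2,3\}$ meets each part in exactly one vertex, so the answer is {\sc Yes}. In the second query no hyperedge does: $\{1,2,3\}$ has two vertices in the first part and $\{3,4,5\}$ has none.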

[1] Dana Ron, et al. Approximately Counting Triangles in Sublinear Time, 2015, IEEE 56th Annual Symposium on Foundations of Computer Science.

[2] Arijit Ghosh, et al. Triangle Estimation using Polylogarithmic Queries, 2018, ArXiv.

[3] Dana Ron, et al. The Power of an Example, 2014, ACM Trans. Comput. Theory.

[4] Xi Chen, et al. Nearly optimal edge estimation with independent set queries, 2020, SODA.

[5] S. Matthew Weinberg, et al. Computing Exact Minimum Cuts Without Knowing the Graph, 2017, ITCS.

[6] Oded Goldreich, et al. Introduction to Property Testing, 2017.

[7] Saket Saurabh, et al. Parameterized Query Complexity of Hitting Set using Stability of Sunflowers, 2018, ISAAC.

[8] Kitty Meeks, et al. Approximately counting and sampling small witnesses using a colourful decision oracle, 2019, SODA.

[9] Uriel Feige, et al. On sums of independent random variables with unbounded variance, and estimating the average degree in a graph, 2004, STOC '04.

[10] Larry J. Stockmeyer, et al. On Approximation Algorithms for #P, 1985, SIAM J. Comput.

[11] Dana Ron, et al. On approximating the number of k-cliques in sublinear time, 2017, STOC.

[12] Dana Ron, et al. Counting stars and other small subgraphs in sublinear time, 2010, SODA '10.

[13] Jeong Han Kim, et al. Optimal query complexity bounds for finding graphs, 2008, Artif. Intell.

[14] Cyrus Rashtchian, et al. Edge Estimation with Independent Set Oracles, 2017, ITCS.

[15] Dana Ron, et al. Finding cycles and trees in sublinear time, 2010, Random Struct. Algorithms.

[16] Holger Dell, et al. Fine-grained reductions from approximate counting to decision, 2017, STOC.

[17] Alessandro Panconesi, et al. Concentration of Measure for the Analysis of Randomized Algorithms, 2009.

[18] Larry J. Stockmeyer, et al. The complexity of approximate counting, 1983, STOC.

[19] Dana Ron, et al. Approximating average parameters of graphs, 2008, Random Struct. Algorithms.