Triangle Estimation using Polylogarithmic Queries

Estimating the number of triangles in a graph is one of the most fundamental problems in sublinear algorithms. In this work, we provide the first approximate triangle counting algorithm that uses only polylogarithmically many queries. Our query oracle, "Tripartite Independent Set" (TIS), takes three pairwise disjoint sets of vertices $A$, $B$ and $C$ as input and answers whether there exists a triangle with one vertex in each of the three sets. Our query model is inspired by the "Bipartite Independent Set" (BIS) query oracle of Beame et al. (ITCS, 2018): a BIS query takes two disjoint sets $A$ and $B$ as input and answers whether there is an edge with one endpoint in $A$ and the other in $B$, and their edge estimation algorithm requires only polylogarithmically many BIS queries. We extend the algorithmic framework of Beame et al., with TIS replacing BIS, to triangle counting, using ideas from color coding due to Alon et al. (J. ACM, 1995) and a concentration inequality for sums of random variables with bounded dependency due to Janson (Rand. Struct. Alg., 2004).
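To make the semantics of the two oracles concrete, the following is a minimal Python sketch of what a BIS query and a TIS query answer on an explicitly stored graph. The class name `QueryOracle`, its methods, and the brute-force evaluation are illustrative assumptions; in the query model the oracle is a black box, and this sketch says nothing about how the estimation algorithm chooses its queries.

```python
from itertools import product


class QueryOracle:
    """Hypothetical brute-force oracle over an explicit undirected graph."""

    def __init__(self, edges):
        # Build a symmetric adjacency map from a list of (u, v) pairs.
        self.adj = {}
        for u, v in edges:
            self.adj.setdefault(u, set()).add(v)
            self.adj.setdefault(v, set()).add(u)

    def _has_edge(self, u, v):
        return v in self.adj.get(u, set())

    @staticmethod
    def _require_disjoint(*vertex_sets):
        # Both query types are defined only for pairwise disjoint inputs.
        seen = set()
        for s in vertex_sets:
            if seen & s:
                raise ValueError("query sets must be pairwise disjoint")
            seen |= s

    def bis(self, A, B):
        """BIS: is there an edge with one endpoint in A and the other in B?"""
        A, B = set(A), set(B)
        self._require_disjoint(A, B)
        return any(self._has_edge(a, b) for a, b in product(A, B))

    def tis(self, A, B, C):
        """TIS: is there a triangle with one vertex in each of A, B and C?"""
        A, B, C = set(A), set(B), set(C)
        self._require_disjoint(A, B, C)
        return any(
            self._has_edge(a, b) and self._has_edge(b, c) and self._has_edge(a, c)
            for a, b, c in product(A, B, C)
        )


# Example graph: the triangle {1, 2, 3} plus a pendant edge (3, 4).
oracle = QueryOracle([(1, 2), (2, 3), (1, 3), (3, 4)])
```

On this example graph, `oracle.tis({1}, {2}, {3})` holds because the three singletons together contain the triangle, whereas `oracle.tis({1}, {2}, {4})` does not, since vertex 4 lies on no triangle.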

[1]  Graham Cormode, et al. A second look at counting triangles in graph streams (corrected), 2017, Theor. Comput. Sci.

[2]  Srikanta Tirthapura, et al. Parallel triangle counting in massive streaming graphs, 2013, CIKM.

[3]  Kun-Lung Wu, et al. Counting and Sampling Triangles from a Graph Stream, 2013, Proc. VLDB Endow.

[4]  Dana Ron, et al. Approximating average parameters of graphs, 2008, Random Struct. Algorithms.

[5]  Mam Riess Jones. Color Coding, 1962, Human factors.

[6]  Dana Ron, et al. Approximately Counting Triangles in Sublinear Time, 2015, 2015 IEEE 56th Annual Symposium on Foundations of Computer Science.

[7]  Thomas Sauerwald, et al. Counting Arbitrary Subgraphs in Data Streams, 2012, ICALP.

[8]  Noga Alon, et al. Finding and counting given length cycles, 1997, Algorithmica.

[9]  Svante Janson, et al. Large deviations for sums of partly dependent random variables, 2004, Random Struct. Algorithms.

[10]  Alon Itai, et al. Finding a minimum circuit in a graph, 1977, STOC '77.

[11]  Eric Price, et al. A Hybrid Sampling Scheme for Triangle Counting, 2016, SODA.

[12]  Dana Ron, et al. On approximating the number of k-cliques in sublinear time, 2017, STOC.

[13]  Ramana Rao Kompella, et al. Graph sample and hold: a framework for big-graph analytics, 2014, KDD.

[14]  Christian Sohler, et al. Counting triangles in data streams, 2006, PODS.

[15]  Alessandro Panconesi, et al. Concentration of Measure for the Analysis of Randomized Algorithms, 2009.

[16]  Mohammad Ghodsi, et al. New Streaming Algorithms for Counting Triangles in Graphs, 2005, COCOON.

[17]  Dana Ron, et al. Counting stars and other small subgraphs in sublinear time, 2010, SODA '10.

[18]  Cyrus Rashtchian, et al. Edge Estimation with Independent Set Oracles, 2017, ITCS.

[19]  Uri Zwick, et al. Listing Triangles, 2014, ICALP.