Accelerating All-Edge Common Neighbor Counting on Three Processors

We propose to accelerate an important but time-consuming operation in online graph analytics: counting the common neighbors of each pair of adjacent vertices (u,v), that is, of each edge (u,v). We target three modern processors of different architectures. We study two representative algorithms for this problem: (1) a merge-based pivot-skip algorithm (MPS), which intersects the neighbor sets of the two endpoints of each edge (u,v) to obtain the count; and (2) a bitmap-based algorithm (BMP), which dynamically constructs a bitmap index on the neighbor set of each vertex u and, for each neighbor v of u, looks up v's neighbors in u's bitmap. We parallelize and optimize both algorithms on a multicore CPU, an Intel Xeon Phi Knights Landing processor (KNL), and an NVIDIA GPU. Our experiments show that (1) both the CPU and the GPU favor BMP, whereas MPS wins on the KNL; (2) across all datasets, the best performer is either MPS on the KNL or BMP on the GPU; and (3) our optimized algorithms complete the operation within tens of seconds on billion-edge Twitter graphs, enabling online analytics.
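
To make the two approaches concrete, the following is a minimal sequential sketch in Python, not the parallel, optimized implementations evaluated in this work. It assumes the graph is an undirected adjacency list `adj` with sorted neighbor lists. The function names are hypothetical, and the pivot-skip step is approximated with a binary search (`bisect_left`), which is one plausible reading of "pivot-skip" rather than the exact skipping strategy used here.

```python
from bisect import bisect_left


def count_pivot_skip(a, b):
    """Count common elements of two sorted lists.

    Assumed sketch of a merge with pivot skipping: each list in turn
    jumps ahead to the first element >= the other list's current pivot.
    """
    i = j = count = 0
    while i < len(a) and j < len(b):
        i = bisect_left(a, b[j], i)      # skip in a up to pivot b[j]
        if i == len(a):
            break
        j = bisect_left(b, a[i], j)      # skip in b up to pivot a[i]
        if j == len(b):
            break
        if a[i] == b[j]:                 # match: one common neighbor
            count += 1
            i += 1
            j += 1
    return count


def all_edge_counts_mps(adj):
    """Merge-based variant: intersect N(u) and N(v) for every edge (u, v)."""
    return {(u, v): count_pivot_skip(adj[u], adj[v])
            for u in range(len(adj)) for v in adj[u]}


def all_edge_counts_bmp(adj):
    """Bitmap-based variant: index N(u) once, then probe it for each v."""
    bitmap = bytearray(len(adj))         # one flag per vertex (a byte, for simplicity)
    counts = {}
    for u in range(len(adj)):
        for w in adj[u]:                 # build the bitmap on N(u)
            bitmap[w] = 1
        for v in adj[u]:                 # look up v's neighbors in u's bitmap
            counts[(u, v)] = sum(bitmap[w] for w in adj[v])
        for w in adj[u]:                 # clear only the bits that were set
            bitmap[w] = 0
    return counts


# Toy graph with edges 0-1, 0-2, 1-2, 2-3 (each stored in both directions).
adj = [[1, 2], [0, 2], [0, 1, 3], [2]]
mps = all_edge_counts_mps(adj)
bmp = all_edge_counts_bmp(adj)
```

Both functions return one count per directed edge entry, so each undirected edge appears twice. The sketch only shows the access-pattern difference: the merge variant scans two sorted lists per edge, while the bitmap variant pays one build and one clear per vertex u and then does constant-time lookups.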
