InferTurbo: A Scalable System for Boosting Full-graph Inference of Graph Neural Network over Huge Graphs

GNN inference is a non-trivial task, especially in industrial scenarios with giant graphs, because of three main challenges: scalability for full-graph inference on huge graphs, inconsistency caused by stochastic acceleration strategies (e.g., sampling), and serious redundant computation. To address these challenges, we propose a scalable system named InferTurbo to boost GNN inference tasks in industrial scenarios. Inspired by the "think-like-a-vertex" philosophy, a GAS-like (Gather-Apply-Scatter) schema is proposed to describe the computation paradigm and data flow of GNN inference. The computation of GNNs is expressed iteratively: in each iteration, a vertex gathers messages via its in-edges, updates its state by forwarding the associated GNN layer with those messages, and then sends the updated information to other vertices via its out-edges. Following this schema, InferTurbo can be built on alternative backends (e.g., a batch processing system or a graph computing system). Moreover, InferTurbo introduces several strategies, such as shadow-nodes and partial-gather, to handle nodes with large degrees for better load balancing. With InferTurbo, GNN inference can be conducted hierarchically over the full graph without sampling or redundant computation. Experimental results demonstrate that our system is robust and efficient for inference tasks over graphs containing hub nodes with many adjacent edges. Meanwhile, the system achieves remarkable performance gains over the traditional inference pipeline, and it can finish a GNN inference task over a graph with tens of billions of nodes and hundreds of billions of edges within 2 hours.
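The Gather-Apply-Scatter iteration described above can be illustrated with a minimal single-process sketch. This is not InferTurbo's implementation: the names `gas_inference` and `mean_layer` are hypothetical, the "layer" is a toy averaging function standing in for a trained GNN layer, and the shadow-nodes and partial-gather strategies are not modelled. It only shows how each vertex state is computed exactly once per layer over the full graph, with no neighbor sampling.

```python
from collections import defaultdict


def gas_inference(edges, features, layers):
    """Layer-wise full-graph inference in a Gather-Apply-Scatter style.

    edges:    list of directed (src, dst) pairs
    features: dict mapping node -> list of floats (initial node states)
    layers:   list of callables, layer(state, messages) -> new state
    All names here are illustrative assumptions, not InferTurbo's API.
    """
    out_edges = defaultdict(list)
    for src, dst in edges:
        out_edges[src].append(dst)

    state = dict(features)
    for layer in layers:
        # Scatter: every vertex sends its current state along its out-edges.
        inbox = defaultdict(list)
        for src, dsts in out_edges.items():
            for dst in dsts:
                inbox[dst].append(state[src])
        # Gather + Apply: every vertex reads the messages that arrived on its
        # in-edges and forwards one layer, exactly once per vertex per layer.
        state = {v: layer(h, inbox.get(v, [])) for v, h in state.items()}
    return state


def mean_layer(state, messages):
    """Toy stand-in for a GNN layer: average the vertex's own state with the
    element-wise mean of its incoming messages."""
    if not messages:
        return list(state)
    dim = len(state)
    mean = [sum(m[i] for m in messages) / len(messages) for i in range(dim)]
    return [(s + a) / 2 for s, a in zip(state, mean)]


if __name__ == "__main__":
    edges = [(0, 1), (1, 2), (0, 2)]
    features = {0: [1.0], 1: [3.0], 2: [5.0]}
    print(gas_inference(edges, features, [mean_layer, mean_layer]))
```

Because every layer is a full pass over all vertices, a k-layer model needs k iterations and never recomputes the overlapping k-hop neighborhoods of different target nodes, which is the redundancy the abstract refers to. In a distributed backend the `inbox` dictionary would correspond to messages exchanged between workers.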
