A Deep Dive Into Understanding The Random Walk-Based Temporal Graph Learning

Machine learning on graph data has gained significant interest because of its applicability to domains ranging from product recommendation to drug discovery. While the algorithmic community is growing rapidly, the computer architecture community has so far focused on a subset of graph learning algorithms, including Graph Convolutional Networks (GCNs) and a few others. In this paper, we study another, more scalable graph learning algorithm based on random walks, which operates on dynamic input graphs and has attracted less attention in the architecture community than GCNs. We propose high-performance CPU and GPU implementations of two important graph learning tasks that use random walks on continuous-time dynamic graphs and cover a broad class of applications: link prediction and node classification. We show that the resulting workload exhibits distinct characteristics, measured in terms of irregularity, core and memory utilization, and cache hit rates, compared to graph traversals, deep learning, and GCNs. We further conduct an in-depth performance analysis focused on both the algorithm and the hardware to guide future software optimization and architecture exploration. The algorithm-focused study presents a rich trade-off space between algorithmic performance and runtime complexity, which we use to identify optimization opportunities. We find an optimal hyperparameter setting that strikes a balance in this trade-off space. Using this setting, we also perform a detailed microarchitectural characterization to analyze the hardware behavior of these applications and uncover execution bottlenecks, which include high cache miss rates and dependency-related stalls. The outcome of our study includes recommendations for further performance optimization and open-source implementations for future investigation.
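To make the core primitive concrete, the following is a minimal Python sketch of a time-respecting random walk on a continuous-time dynamic graph. It is an illustration under stated assumptions, not the paper's CPU or GPU implementation: the edge format `(source, target, timestamp)`, the integer node IDs, the uniform choice among later edges, and the names `build_temporal_adjacency` and `temporal_random_walk` are all hypothetical.

```python
import random
from bisect import bisect_right
from collections import defaultdict


def build_temporal_adjacency(edges):
    """Map each node to its outgoing (timestamp, neighbor) pairs, sorted by time.

    Assumes edges are (source, target, timestamp) tuples with integer node IDs.
    """
    adj = defaultdict(list)
    for u, v, t in edges:
        adj[u].append((t, v))
    for nbrs in adj.values():
        nbrs.sort()
    return adj


def temporal_random_walk(adj, start, max_len, rng, start_time=float("-inf")):
    """Sample one walk whose edge timestamps are strictly increasing.

    Assumption: the next edge is drawn uniformly from all later edges;
    other sampling distributions are possible.
    """
    walk, node, now = [start], start, start_time
    while len(walk) < max_len:
        nbrs = adj.get(node, [])
        # Index of the first outgoing edge with a timestamp greater than `now`.
        lo = bisect_right(nbrs, (now, float("inf")))
        if lo == len(nbrs):
            break  # no later edge: the walk ends early
        now, node = nbrs[rng.randrange(lo, len(nbrs))]
        walk.append(node)
    return walk


# Hypothetical toy graph: the edge 1 -> 0 at t=0.5 is too early to follow 0 -> 1 at t=1.0.
edges = [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 3.0), (1, 0, 0.5)]
adj = build_temporal_adjacency(edges)
walk = temporal_random_walk(adj, start=0, max_len=4, rng=random.Random(0))
print(walk)
```

In a random-walk-based pipeline, many such walks would typically be fed to a skip-gram style embedding model, and the embeddings then used for link prediction or node classification. The data-dependent neighbor lookups in the loop hint at the irregular memory access pattern the abstract refers to.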
