TACC: Topology-Aware Coded Computing for Distributed Graph Processing

This article proposes a coded distributed graph processing framework to alleviate the communication bottleneck in large-scale distributed graph processing. In particular, we propose a topology-aware coded computing (TACC) algorithm with two novel features: (i) a topology-aware graph allocation strategy, and (ii) a coded aggregation scheme that combines the intermediate computations for graph processing while constructing coded messages. The proposed setup yields a trade-off between computation and communication: increasing the computation load at the distributed parties reduces the communication load. We demonstrate the effectiveness of the TACC algorithm by comparing its communication load with that of existing setups on both Erdős-Rényi and Barabási-Albert random graphs, as well as on the real-world Google web graph for PageRank computations. In particular, we show that the proposed coding strategy can lead to up to 82% reduction in communication load and up to 46% reduction in overall execution time compared to the state of the art, when implemented on the Amazon EC2 cloud compute platform.
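The computation-communication trade-off mentioned above can be illustrated with a toy coded-multicast example. This is a minimal sketch and not the TACC algorithm itself: the three-worker layout, the `storage` map, and the `intermediate` function with its placeholder integer payloads are all assumptions made for illustration. Each graph partition is stored on two workers (extra computation), which lets one XOR-coded multicast replace two unicast messages.

```python
# Toy illustration only (assumed setup, not the TACC algorithm):
# 3 workers, 3 graph partitions, each partition stored on 2 workers.
storage = {1: {1, 2}, 2: {2, 3}, 3: {1, 3}}  # worker -> partitions it stores


def intermediate(reducer, partition):
    """Hypothetical intermediate value that `reducer` needs from `partition`."""
    return 100 * reducer + partition  # placeholder integer payload


# Each worker is missing exactly one partition and needs one value from it.
needs = {k: ({1, 2, 3} - storage[k]).pop() for k in storage}

# Uncoded shuffle: one unicast message per missing value.
uncoded_messages = len(needs)

# Coded shuffle: worker 1 stores partitions 1 and 2, so it can compute the
# values needed by worker 2 (from partition 1) and worker 3 (from partition 2)
# and send their XOR as a single multicast message.
v2 = intermediate(2, needs[2])
v3 = intermediate(3, needs[3])
coded = v2 ^ v3

# Worker 2 stores partition 2, so it recomputes v3 locally and cancels it.
decoded_at_2 = coded ^ intermediate(3, 2)
# Worker 3 stores partition 1, so it recomputes v2 locally and cancels it.
decoded_at_3 = coded ^ intermediate(2, 1)

# Worker 1 still receives its missing value through one plain unicast.
coded_messages = 2

print(uncoded_messages, coded_messages)
```

In this assumed layout, storing every partition twice brings the shuffle down from three messages to two; the decoding works only because each receiver can recompute the interfering value from a partition it already holds.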
