A Combinatorial Design for Cascaded Coded Distributed Computing on General Networks

Coding-theoretic approaches have been developed to significantly reduce the communication load in modern distributed computing systems. In particular, coded distributed computing (CDC), introduced by Li et al., can efficiently trade computation resources for a reduced communication load in MapReduce-like computing systems. In the more general cascaded CDC, Map computations are repeated at r nodes to significantly reduce the communication load among nodes, and each of the Q Reduce functions is computed s times. In this paper, we propose a novel low-complexity combinatorial design for cascaded CDC which 1) determines both input file and output function assignments, 2) requires significantly fewer input files and output functions, and 3) operates on heterogeneous networks where nodes have varying storage and computing capabilities. We provide an analytical characterization of the computation-communication tradeoff, from which we show that the proposed scheme can outperform the state-of-the-art scheme proposed by Li et al. for homogeneous networks. Further, when the network is heterogeneous, we show that the performance of the proposed scheme can be better than that of its homogeneous counterpart. In addition, the proposed scheme is optimal within a constant factor of the information-theoretic converse bound when the input file and output function assignments are fixed.
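To make the computation-communication tradeoff concrete, the following minimal sketch evaluates the baseline tradeoff of Li et al. for the non-cascaded special case s = 1 on a homogeneous network of K nodes, where the coded communication load is (1/r)(1 - r/K) and the uncoded load is 1 - r/K. It does not implement the combinatorial design proposed in this paper; the function names and the example values of K and r are illustrative assumptions.

```python
from fractions import Fraction


def uncoded_load(K: int, r: int) -> Fraction:
    """Normalized communication load without coding: 1 - r/K."""
    if not 1 <= r <= K:
        raise ValueError("computation load r must satisfy 1 <= r <= K")
    return Fraction(K - r, K)


def coded_load(K: int, r: int) -> Fraction:
    """Li et al. baseline for s = 1 (homogeneous network): (1/r)(1 - r/K)."""
    if not 1 <= r <= K:
        raise ValueError("computation load r must satisfy 1 <= r <= K")
    return Fraction(K - r, r * K)


if __name__ == "__main__":
    K = 10  # hypothetical number of nodes
    for r in range(1, K + 1):
        # Repeating each Map computation at r nodes cuts the load by a factor of r.
        print(r, uncoded_load(K, r), coded_load(K, r))
```

Under these assumptions, repeating each Map computation at r nodes reduces the communication load by a factor of r relative to the uncoded scheme, which is the gain the cascaded and heterogeneous designs build on.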
