Price of Precision in Coded Distributed Matrix Multiplication: A Dimensional Analysis

Coded distributed matrix multiplication (CDMM) schemes, such as MatDot codes, seek efficient ways to distribute matrix multiplication tasks across a set of $N$ distributed servers so that the answers returned by any $R$ servers are sufficient to recover the desired products. For example, to compute the product of matrices $U$ and $V$, MatDot codes partition each matrix into $p > 1$ sub-matrices to create smaller coded computation tasks, which reduces the upload/storage at each server to a fraction $1/p$ of the original, while $UV$ can be recovered from the answers returned by any $R = 2p - 1$ servers. An important concern in CDMM is to reduce the recovery threshold $R$ for a given storage/upload constraint. Recently, Jeong et al. introduced Approximate MatDot (AMD) codes, which are shown to improve the recovery threshold by a factor of nearly 2, from $2p - 1$ to $p$. A key observation that motivates our work is that the storage/upload required for approximate computing depends not only on the dimensions of the (coded) sub-matrices assigned to each server, but also on their precision levels, a critical aspect that is not explored by Jeong et al. Our main contribution is a rudimentary asymptotic dimensional analysis of AMD codes, inspired by the Generalized Degrees of Freedom (GDoF) framework previously developed for wireless networks. The analysis indicates that, for the same upload/storage and once the precision levels of the task assignments are accounted for, AMD codes are no better than a replication scheme that assigns the full computation task to every server. The dimensional analysis is supported by simple numerical experiments.
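
The abstract does not spell out the MatDot construction, so the following is only a minimal sketch of exact MatDot encoding and decoding over the reals, assuming the usual polynomial formulation: $U$ is split column-wise and $V$ row-wise into $p$ blocks, each server receives one evaluation of two matrix polynomials, and $UV$ is the coefficient of $x^{p-1}$ in their product. The function names `matdot_encode` and `matdot_decode`, the matrix sizes, and the evaluation points are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def matdot_encode(U, V, p, points):
    """Return one coded pair (U_coded, V_coded) per evaluation point (hypothetical helper)."""
    U_blocks = np.split(U, p, axis=1)  # column-wise partition of U
    V_blocks = np.split(V, p, axis=0)  # row-wise partition of V
    tasks = []
    for x in points:
        # p_U(x) = sum_i U_i x^i and p_V(x) = sum_j V_j x^(p-1-j)
        U_coded = sum(U_blocks[i] * x**i for i in range(p))
        V_coded = sum(V_blocks[j] * x**(p - 1 - j) for j in range(p))
        tasks.append((U_coded, V_coded))
    return tasks

def matdot_decode(answers, points, p):
    """Recover UV from R = 2p - 1 answers by polynomial interpolation (hypothetical helper)."""
    R = 2 * p - 1
    # The product polynomial has degree 2p - 2, so R evaluations determine it.
    vand = np.vander(np.asarray(points, dtype=float), R, increasing=True)
    stacked = np.stack(answers).reshape(R, -1)
    coeffs = np.linalg.solve(vand, stacked)
    # The coefficient of x^(p-1) equals sum_i U_i V_i = UV.
    return coeffs[p - 1].reshape(answers[0].shape)

# Assumed toy setting: p = 3 blocks, N = 7 servers, any R = 5 answers suffice.
rng = np.random.default_rng(0)
p, N = 3, 7
U = rng.standard_normal((4, 6))
V = rng.standard_normal((6, 4))
points = np.linspace(-1.0, 1.0, N)

tasks = matdot_encode(U, V, p, points)
responders = [0, 2, 3, 5, 6]  # an arbitrary subset of R = 2p - 1 servers
answers = [tasks[k][0] @ tasks[k][1] for k in responders]
recovered = matdot_decode(answers, points[responders], p)
max_error = np.max(np.abs(recovered - U @ V))
```

Each coded pair holds $1/p$ of the entries of $U$ and $V$, which is the storage/upload reduction described above. The sketch counts only dimensions: it says nothing about how many bits of precision each coded entry needs, and the real Vandermonde solve used for decoding becomes increasingly ill-conditioned as $p$ grows.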
