A New Design Framework for Heterogeneous Uncoded Storage Elastic Computing

Elasticity is an important feature of modern cloud computing systems, and it can cause computations to fail or significantly increase computing time. Elasticity means that virtual machines in the cloud can be preempted on short notice (e.g., hours or minutes) when a high-priority job appears; conversely, new virtual machines may become available over time to compensate for the lost computing resources. Coded Storage Elastic Computing (CSEC), introduced by Yang et al. in 2018, is an effective and efficient approach to handling elasticity with relatively low storage and computation load. However, one limitation of CSEC is that it may only be applicable to certain types of computations (e.g., linear ones) and may be challenging to apply to more involved computations, because these often require coded data storage and approximation. Hence, uncoded storage, in which data is copied directly onto the virtual machines, may be preferable. In addition, our own measurements show that virtual machines on Amazon EC2 clusters often have heterogeneous computation speeds even when they have exactly the same configuration (e.g., CPU, RAM, I/O cost). In this paper, we introduce a new optimization framework for Uncoded Storage Elastic Computing (USEC) systems with heterogeneous computing speeds that minimizes the overall computation time. Under this framework, we propose optimal solutions for USEC systems with or without straggler tolerance using different storage placements. We evaluate the proposed algorithms using power iteration applications on Amazon EC2.

[1] Alexandros G. Dimakis, et al. Gradient Coding: Avoiding Stragglers in Distributed Learning, 2017, ICML.

[2] Rong-Rong Chen, et al. A Practical Algorithm Design and Evaluation for Heterogeneous Elastic Computing with Stragglers, 2021, 2021 IEEE Global Communications Conference (GLOBECOM).

[3] Min Ye, et al. Communication-Computation Efficient Gradient Coding, 2018, ICML.

[4] Rong-Rong Chen, et al. Coded Elastic Computing on Machines With Heterogeneous Storage and Computation Speed, 2020, IEEE Transactions on Communications.

[5] Soummya Kar, et al. Coded Elastic Computing, 2018, 2019 IEEE International Symposium on Information Theory (ISIT).

[6] Zahir Tari, et al. Optimizing the Transition Waste in Coded Elastic Computing, 2020, 2020 IEEE International Symposium on Information Theory (ISIT).

[7] Urs Niesen, et al. Fundamental limits of caching, 2012, 2013 IEEE International Symposium on Information Theory.

[8] Stark C. Draper, et al. Hierarchical Coded Elastic Computing, 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[9] Rong-Rong Chen, et al. Uncoded Placement With Linear Sub-Messages for Private Information Retrieval From Storage Constrained Databases, 2020, IEEE Transactions on Communications.

[10] Rong-Rong Chen, et al. Heterogeneous Computation Assignments in Coded Elastic Computing, 2020, 2020 IEEE International Symposium on Information Theory (ISIT).

[11] Alexandros G. Dimakis, et al. Gradient Coding From Cyclic MDS Codes and Expander Graphs, 2017, IEEE Transactions on Information Theory.