Superlinear speedup in Windows Azure cloud

The Azure cloud offers Platform-as-a-Service (PaaS) to its customers and uses the Hyper-V hypervisor to manage virtual machine instances. Several papers report superlinear speedup when comparing parallel and sequential executions of the dense matrix-matrix multiplication algorithm, and superlinear speedup has also been confirmed in some virtualized and cloud environments. In this paper we carried out a series of experiments to determine whether superlinear speedup is also achievable on the Azure cloud when executing the dense matrix-matrix multiplication algorithm on the same hardware infrastructure. In addition to testing the hypothesis that superlinear speedup exists, we explain the experimental results theoretically and determine the regions and particular matrix sizes where it may occur.
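For readers unfamiliar with the term, the usual definition of speedup is the ratio of sequential to parallel execution time, S(p) = T_seq / T_par(p), and the speedup is called superlinear when S(p) > p for p processing elements. The sketch below only illustrates that definition. The function names and the timings are hypothetical and are not taken from the experiments described here.

```python
def speedup(t_sequential: float, t_parallel: float) -> float:
    """Return the speedup S(p) = T_seq / T_par(p)."""
    if t_sequential <= 0 or t_parallel <= 0:
        raise ValueError("execution times must be positive")
    return t_sequential / t_parallel


def is_superlinear(t_sequential: float, t_parallel: float, p: int) -> bool:
    """Return True when the speedup exceeds the number of processing elements p."""
    if p < 1:
        raise ValueError("p must be at least 1")
    return speedup(t_sequential, t_parallel) > p


if __name__ == "__main__":
    # Hypothetical timings in seconds, chosen only to illustrate the definition.
    t_seq, t_par, p = 100.0, 20.0, 4
    print(speedup(t_seq, t_par), is_superlinear(t_seq, t_par, p))
```

With these illustrative numbers the speedup is 5 on 4 processing elements, which counts as superlinear. A parallel time of 30 seconds would give a speedup of about 3.3, which is sublinear.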
