Speedup and scalability analysis of Master-Slave applications on large heterogeneous clusters

Although cluster environments offer enormous potential processing power, real applications that take advantage of this power remain elusive. This is due, in part, to a limited understanding of the characteristics of the applications best suited to these environments. This paper focuses on Master/Slave applications for large heterogeneous clusters. It defines application, cluster, and execution models and uses them to derive an analytic expression for the execution time. It then defines speedup and derives speedup bounds based on the inherent parallelism of the application and the aggregated computing power of the cluster. The paper also derives an analytic expression for efficiency and uses it to define the scalability of the algorithm-cluster combination based on the isoefficiency metric. Furthermore, it establishes necessary and sufficient conditions for an algorithm-cluster combination to be scalable; these conditions are easy to verify and use in practice. Finally, it covers the impact of network contention as the number of processors grows.
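The quantities named in the abstract can be illustrated with a minimal sketch. This is not the paper's model: it assumes perfectly divisible work, a fixed serial master overhead, no network contention, and worker speeds expressed relative to a reference processor. The function names and the numbers are hypothetical and chosen only for illustration.

```python
def master_slave_time(work, speeds, master_overhead=0.0):
    """Execution time under the simplifying assumptions stated above:
    the master spends `master_overhead` serially, then the workers share
    `work` in proportion to their speeds (an assumption, not the paper's model)."""
    return master_overhead + work / sum(speeds)


def speedup(reference_time, parallel_time):
    """Speedup relative to a single reference processor."""
    return reference_time / parallel_time


def efficiency(speedup_value, speeds):
    """Speedup divided by the aggregated computing power of the cluster,
    measured in units of the reference processor's speed."""
    return speedup_value / sum(speeds)


# Hypothetical cluster: three workers with relative speeds 1, 2 and 3.
speeds = [1.0, 2.0, 3.0]
work = 120.0                      # work units; the reference processor does 1 unit per second
reference_time = work / 1.0       # 120 s on the reference processor alone

parallel_time = master_slave_time(work, speeds, master_overhead=4.0)
s = speedup(reference_time, parallel_time)
e = efficiency(s, speeds)
print(f"time={parallel_time:.1f}s speedup={s:.2f} efficiency={e:.3f}")
```

In this toy setting the speedup can never exceed the aggregated computing power (6 here), and any serial overhead at the master pushes efficiency below 1. An isoefficiency-style question then asks how fast the work must grow as workers are added to keep that efficiency constant.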
