lognP and log3P: Accurate Analytical Models of Point-to-point Communication in Distributed Systems

Many existing models of point-to-point communication in distributed systems ignore the impact of memory and middleware. Including such details may make these models impractical. Nonetheless, the growing gap between memory and CPU performance, combined with the trend toward large-scale, clustered shared-memory platforms, implies an increased need to consider the impact of middleware on distributed communication. We present a general software-parameterized model of point-to-point communication for use in performance prediction and evaluation. We illustrate the utility of the model in three ways: 1) to derive a simplified, useful, more accurate model of point-to-point communication in clusters of SMPs, 2) to predict and analyze point-to-point and broadcast communication costs in clusters of SMPs, and 3) to express, compare, and contrast existing communication models. Though our methods are general, we present results on several Linux clusters to illustrate practical use on real systems.
