Analyzing the Energy and Power Consumption of Remote Memory Accesses in the OpenSHMEM Model

PGAS models such as OpenSHMEM provide interfaces that explicitly initiate one-sided remote memory accesses among processes. The model also provides synchronizing barriers to ensure a consistent view of the distributed memory at different phases of an application. Incorrect use of these interfaces limits the scalability achievable with a parallel programming model. This study aims to understand the effects of these constructs on the energy and power consumption of OpenSHMEM applications. Our experiments show that the cost incurred, in terms of total energy and power consumed, depends on multiple factors across the software and hardware stack. We conclude that the power consumed by the CPU and DRAM is significantly affected by the design of the data transfer patterns within an application, the design of the communication protocols within the middleware, the architectural constraints imposed by the interconnect, and the levels of the memory hierarchy within a compute node. This work motivates treating energy and power consumption as important factors when designing compute solutions for current and future distributed systems.
