Methodology for Adaptive Active Message Coalescing in Task Based Runtime Systems

Overheads associated with fine-grained communication in task-based runtime systems are a major bottleneck limiting the performance of distributed applications. In this research, we provide a methodology and metrics for analyzing network overheads using the introspection capabilities of HPX, a task-based runtime system. We demonstrate that our metrics correlate strongly with the overall runtime of our test applications. Our aim is to eventually use these metrics to tune, at runtime, the parameters that control active message coalescing. This approach improves on the post-mortem analysis techniques currently employed to tune network settings in distributed applications.
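
To make the tuning target concrete, the following is a minimal sketch of count-based message coalescing, written under stated assumptions rather than taken from HPX. The names `Coalescer`, `max_messages`, and `count_sends` are hypothetical and introduced only for illustration. The sketch buffers outgoing messages and hands them to the network as one batch once a threshold is reached, so the fixed per-send overhead is paid once per batch instead of once per message. A production runtime would typically also flush on a timer so that messages are not delayed indefinitely; that is omitted here.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical coalescing buffer for illustration; this is NOT HPX's API.
class Coalescer {
public:
    using Batch = std::vector<std::string>;
    using SendFn = std::function<void(const Batch&)>;

    // max_messages is the (assumed) tunable parameter: how many messages
    // are collected before one combined send is issued.
    Coalescer(std::size_t max_messages, SendFn send)
        : max_messages_(max_messages == 0 ? 1 : max_messages),
          send_(std::move(send)) {}

    // Adjust the threshold while running, e.g. in response to a metric.
    void set_max_messages(std::size_t n) {
        max_messages_ = (n == 0) ? 1 : n;
        if (pending_.size() >= max_messages_) flush();
    }

    // Queue one message; send the batch once the threshold is reached.
    void post(std::string msg) {
        pending_.push_back(std::move(msg));
        if (pending_.size() >= max_messages_) flush();
    }

    // Send whatever is buffered as a single batch (no-op if empty).
    void flush() {
        if (pending_.empty()) return;
        send_(pending_);
        pending_.clear();
    }

private:
    std::size_t max_messages_;
    SendFn send_;
    Batch pending_;
};

// Count how many network sends are needed to deliver n_messages
// when batches hold at most max_messages each.
std::size_t count_sends(std::size_t n_messages, std::size_t max_messages) {
    std::size_t sends = 0;
    Coalescer c(max_messages, [&sends](const Coalescer::Batch&) { ++sends; });
    for (std::size_t i = 0; i < n_messages; ++i) c.post("msg");
    c.flush();  // deliver any partially filled final batch
    return sends;
}
```

With this sketch, delivering ten messages takes ten sends at a threshold of 1 and three sends at a threshold of 4. A larger threshold reduces the number of sends but delays individual messages, which is the trade-off a runtime tuning scheme would have to balance.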
