A Multi-level Monitoring Framework for Stream-Based Coordination Programs

Stream-based coordination is a promising approach to executing programs on parallel hardware such as multi-core systems. It allows sequential code to be reused at the component level and extended with concurrency handling at the coordination level. In this paper, we identify the monitoring information required to calculate performance metrics, balance load automatically, and detect bottlenecks. This information is obtained by implicitly instrumenting multiple levels: the runtime system and the operating system. We evaluate the monitoring overhead for different use cases on S-Net, which is a challenging monitoring benchmark because of its flexible and fully asynchronous execution model, including dynamic mapping and scheduling policies. The evaluation shows that in most cases monitoring causes a negligible overhead of less than five percent.
