Dynamic high-level scripting in parallel applications

Parallel applications typically run in batch mode, sometimes after long waits in a scheduler queue. In some situations, it is desirable to add new functionality to the running application interactively, without recompiling and rerunning it. For example, a debugger could upload code to perform consistency checks, or a data analyst could upload code to perform new statistical tests. This paper presents a scalable technique for dynamically inserting code into running parallel applications. We describe and evaluate an implementation of this idea that allows a user to upload Python code into a running parallel application, where the uploaded code runs in concert with the main code. We demonstrate the effectiveness of this technique in two case studies: introspection support for parallel debugging, and data analysis of large cosmological datasets.
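
As a rough illustration of the idea of uploading code into a running application, the following is a minimal single-process sketch in Python. It is not the implementation described in the paper: the class `RunningApp`, its `upload` method, the `state` dictionary, and the convention that a snippet reports back through a variable named `result` are all hypothetical names chosen for this example, and no real parallelism or networking is involved. The application checks a queue between steps of its main loop and executes any pending snippet against its own state.

```python
import queue


class RunningApp:
    """Toy stand-in for a long-running application (hypothetical, single process)."""

    def __init__(self, steps):
        self.steps = steps
        self.state = {"step": 0, "values": [1.0, 2.0, 3.0]}
        self.uploads = queue.Queue()  # source strings sent by the user
        self.results = queue.Queue()  # one entry per executed snippet

    def upload(self, source):
        """Queue a Python snippet; it runs at the next step boundary."""
        self.uploads.put(source)

    def _run_uploaded(self):
        # Drain pending snippets between steps so they see a consistent state.
        while True:
            try:
                source = self.uploads.get_nowait()
            except queue.Empty:
                return
            namespace = {"state": self.state}
            try:
                exec(source, namespace)
                # Assumed convention: the snippet stores its answer in `result`.
                self.results.put(namespace.get("result"))
            except Exception as exc:
                # A faulty snippet must not bring down the main computation.
                self.results.put(exc)

    def run(self):
        for step in range(self.steps):
            # Main computation: here, simply double every value.
            self.state["step"] = step + 1
            self.state["values"] = [v * 2 for v in self.state["values"]]
            self._run_uploaded()


if __name__ == "__main__":
    app = RunningApp(steps=3)
    app.upload("result = sum(state['values'])")
    app.run()
    print(app.results.get_nowait())
```

In this sketch the snippet is queued before `run()` starts so that the behavior is deterministic. In an interactive setting, `upload` would instead be called from another thread or a network handler while the loop is running, which `queue.Queue` supports because it is thread-safe.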
