Malleable applications for scalable high performance computing

Abstract: Iterative applications are known to run as slowly as their slowest computational component. This paper introduces malleability, a new dynamic reconfiguration strategy that overcomes this limitation. Malleability is the ability to dynamically change the data size and number of computational entities in an application. Middleware can use malleability to autonomously reconfigure an application in response to dynamic changes in resource availability in an architecture-aware manner, allowing applications to optimize their use of multiple processors and diverse memory hierarchies in heterogeneous environments. The modular Internet Operating System (IOS) was extended to reconfigure applications autonomously using malleability. Two different iterative applications were made malleable. The first, used in astronomical modeling and representative of maximum-likelihood applications, was made malleable in the SALSA programming language. The second models the diffusion of heat over a two-dimensional object and is representative of applications such as partial differential equation solvers and some types of distributed simulations. Versions of the heat application were made malleable in both SALSA and MPI. Algorithms for concurrent data redistribution are given for each type of application. Results show that reconfiguration using malleability is 10 to 100 times faster on the tested environments. The algorithms are also shown to be highly scalable with respect to the quantity of data involved. While previous work has shown the utility of dynamically reconfigurable applications using only computational component migration, malleability is shown to provide up to a 15% speedup over component migration alone on a dynamic cluster environment. This work is part of an ongoing research effort to enable applications to be highly reconfigurable and autonomously modifiable by middleware in order to use distributed environments efficiently. Grid computing environments are becoming increasingly heterogeneous and dynamic, placing new demands on applications' adaptive behavior. This work shows that malleability is a key aspect of enabling effective dynamic reconfiguration of iterative applications in such environments.
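
To make the idea of concurrent data redistribution concrete, the following is a minimal, hypothetical C sketch; it is not the SALSA or MPI implementation described in the paper. It assumes a block-row decomposition of the two-dimensional heat grid and computes a transfer plan (which contiguous row ranges must move between which workers) when the worker count changes, for example when the middleware shrinks the application from 8 to 5 workers. The names block_start and print_transfer_plan are invented for this illustration.

    /*
     * Illustrative sketch only: plan block-row redistribution for an iterative
     * 2D stencil (heat diffusion) grid when the number of workers changes.
     * Helper names are hypothetical, not part of IOS, SALSA, or MPI.
     */
    #include <stdio.h>

    /* First row owned by worker `rank` when `rows` rows are split across
     * `workers` workers as evenly as possible (earlier workers get the
     * remainder rows). Passing rank == workers yields `rows` (exclusive end). */
    static int block_start(int rows, int workers, int rank)
    {
        int base = rows / workers;
        int extra = rows % workers;
        return rank * base + (rank < extra ? rank : extra);
    }

    /* For every (old worker, new worker) pair, print the contiguous row range
     * the old owner must send to the new owner: the overlap of their blocks.
     * Rows whose owner does not change require no transfer. */
    static void print_transfer_plan(int rows, int old_workers, int new_workers)
    {
        for (int src = 0; src < old_workers; src++) {
            int src_lo = block_start(rows, old_workers, src);
            int src_hi = block_start(rows, old_workers, src + 1); /* exclusive */
            for (int dst = 0; dst < new_workers; dst++) {
                int dst_lo = block_start(rows, new_workers, dst);
                int dst_hi = block_start(rows, new_workers, dst + 1);
                int lo = src_lo > dst_lo ? src_lo : dst_lo;
                int hi = src_hi < dst_hi ? src_hi : dst_hi;
                if (lo < hi && src != dst)
                    printf("worker %d -> worker %d: rows [%d, %d)\n",
                           src, dst, lo, hi);
            }
        }
    }

    int main(void)
    {
        /* Example: a 1000-row grid shrinking from 8 workers to 5. */
        print_transfer_plan(1000, 8, 5);
        return 0;
    }

Keeping each worker's rows contiguous means every redistribution reduces to at most a handful of point-to-point block transfers per worker, which is what allows the transfers to proceed concurrently; the actual redistribution algorithms and their scalability results are given in the paper itself.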
