Eden: Simplified Management of Atypical High-Performance Computing Jobs

As multiprocessor and multicore technology becomes prevalent, shared-memory architectures with 1,024 or more processing cores are becoming available for general-purpose applications. As an early operator of such a system, the Remote Data Analysis and Visualization (RDAV) center at the University of Tennessee observed a host of user applications needing to scale up their computation by running many concurrent instances of generic codes. This isn't a typical way of using high-performance computing systems, and naive solutions supporting such needs would cause significant issues that hamper system scalability and stability. The RDAV center's Eden software package helps manage large numbers of concurrent serial jobs with high throughput for any such application. Here, the authors describe the motivation and technical nature of Eden and report representative use cases they've participated in during the past two years.
