The process management component of a scalable systems software environment

The systems software necessary to operate large-scale parallel computers presents a variety of research and development issues. One approach is to consider systems software as a collection of interacting components with well-defined, published interfaces. The Scalable Systems Software SciDAC project is currently exploring the feasibility of architecting systems software this way. In this paper, we present a prototype process manager component for such a system. We describe the component abstractly in terms of its functionality and the interface through which that functionality may be invoked. We propose a precise syntax for this interface and describe one implementation of the process manager component, based on an existing scalable process management system called MPD. We conclude with some experiences using this process manager component in conjunction with other systems software components on a medium-sized Linux cluster.