Revamping the OSCAR database: a flexible approach to cluster configuration data management

The OSCAR (T. Naughton et al., 2002) cluster installation toolkit started life as the product of an ad hoc working group that set out to bundle a set of "best practices" for building clusters into a single software solution. Although developed mainly as "skunk works" projects at each of the participating institutions, the OSCAR toolkit has gained a large following, with hundreds of thousands of downloads and active mailing lists. The original OSCAR toolkit was aimed at one particular type of high performance computing (HPC) cluster. Since then, several sub-projects targeting other types of HPC clusters have been spun off from the main working group's efforts. Each of these projects shares a core set of OSCAR code, including the OSCAR database and its access API, "ODA" (OSCAR Database API). The ODA abstraction layer, consisting of a database schema and a corresponding API, hides a commodity back-end database (e.g., MySQL). As OSCAR and its derivatives are targeted at new, innovative environments (including non-HPC environments), the configuration management issues that OSCAR must handle have grown substantially. As a result, the current database schema for holding the cluster configuration is starting to show its age: it is simply unable to represent the complex, ever-growing set of data required to accurately describe the clusters that it manages. In addition, ODA's API is overly complex and imposes a steep learning curve on OSCAR developers. This paper proposes a simpler, highly flexible design and implementation that allows ODA not only to handle all the data that it currently manages, but also to expand into new types of clusters, to store and retrieve configuration information in a variety of formats, and to encourage data re-use between the main OSCAR project and its derivative packages.