The MPO API: A tool for recording scientific workflows

Abstract Data from large-scale experiments and extreme-scale computing is expensive to produce and may be used for high-consequence applications. The Metadata, Provenance and Ontology (MPO) project builds on previous work [M. Greenwald, Fusion Eng. Des. 87 (2012) 2205–2208] and is focused on providing documentation of workflows, data provenance and the ability to data-mine large sets of results. While there are important design and development aspects to the data structures and user interfaces, we concern ourselves in this paper with the application programming interface (API) – the set of functions that interface with the data server. Our approach for the data server is to follow the Representational State Transfer (RESTful) software architecture style for client–server communication. At its core, the API uses the POST and GET methods of the HTTP protocol to transfer workflow information in message bodies to targets specified in the URL to and from the database via a web server. Higher level API calls are built upon this core API. This design facilitates implementation on different platforms and in different languages and is robust to changes in the underlying technologies used. The command line client implementation can communicate with the data server from any machine with HTTP access.