Job Provenance - Insight into Very Large Provenance Datasets

Following the job-centric monitoring concept, Job Provenance (JP) service organizes provenance records on the per-job basis. It is designed to manage very large number of records, as was required in the EGEE project where it was developed originally. The quantitative aspect is also a focus of the presented demonstration. We show JP capability to retrieve data items of interest from a large dataset of full records of more than 1 million of jobs, to perform non-trivial transformation on those data, and organize the results in such a way that repeated interactive queries are possible. The application area of the demo is derived from that of previous Provenance Challenges. Though the topic of the demo -- a computational experiment -- is arranged rather artificially, the demonstration still delivers its main message that JP supports non-trivial transformations and interactive queries on large data sets.