Data provenance captured from scientific applications is a critical precursor to data sharing and reuse. For researchers who want to repurpose data, provenance records the lineage and attribution of a data set, information that is needed to establish trust in it. Komadu is a standalone provenance capture and visualization system that captures, represents, and manipulates provenance from scientific tools, infrastructures, and repositories. It uses the W3C PROV standard [1] to represent provenance data, and it is the successor to the Karma [2] provenance capture system, which was based on the Open Provenance Model (OPM) [3]. Komadu provides two interfaces: a Web Services interface based on Apache Axis2 [4] and a messaging interface based on RabbitMQ [5]. Komadu is open source, and its source code is publicly available on GitHub [6]. Although Komadu has been used most extensively in scientific research, its interfaces are designed to collect and visualize provenance from any kind of application.
[1] Yolanda Gil, et al. PROV-DM: The PROV Data Model. 2013.
[2] Geoffrey C. Fox, et al. Twister: a runtime for iterative MapReduce. HPDC '10, 2010.
[3] Yogesh L. Simmhan, et al. The Open Provenance Model core specification (v1.1). Future Gener. Comput. Syst., 2011.
[4] Frederic P. Miller, et al. Apache Maven. 2010.
[5] Sean Bechhofer, et al. Research Objects: Towards Exchange and Reuse of Digital Knowledge. 2010.
[6] Yogesh L. Simmhan, et al. A Framework for Collecting Provenance in Data-Centric Scientific Workflows. 2006 IEEE International Conference on Web Services (ICWS'06), 2006.