Automatic capture and efficient storage of e‐Science experiment provenance

For the first provenance challenge, we introduce a layered model to represent workflow provenance that allows navigation from an abstract model of the experiment to instance data collected during a specific experiment run. We outline modest extensions to a commercial workflow engine so it will automatically capture provenance at workflow runtime. We also present an approach to store this provenance data in a relational database. Finally, we demonstrate how core provenance queries in the challenge can be expressed in SQL and discuss the merits of our layered representation. Copyright © 2007 John Wiley & Sons, Ltd.