Information about the origin or processing of data, along with metadata that helps in understanding the data, can be attached to the data as annotations. Provenance knowledge preserved in annotations is managed by continuously propagating the annotations through the workflow. Annotation-based provenance management generally uses models that associate annotations with data explicitly, and techniques for propagating such annotations have been proposed. There is also a model that associates annotations implicitly: annotations are attached to data at arbitrary granularity by means of queries. We call this implicit model the "multi-granularity annotation" model. Multi-granularity annotation enables flexible association of information. However, no provenance management method using multi-granularity annotations has been reported. We have developed a method for propagating multi-granularity annotations. We define an annotation propagation rule for each relational algebra operation and use these rules to recalculate the scopes of the annotations associated with data. We also address two problems, the loss during data derivation of information needed to preserve annotation associations and the lack of static data annotations, by extending the operations and the association method. Experiments showed that our method requires less space and execution time than conventional annotation management methods.
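To make the idea of query-based (implicit) association and per-operation propagation concrete, the following is a minimal Python sketch. It is an illustration only and not the authors' rules: the `Annotation` class, its `scope`, `targets`, and `reads` fields, and the `select`, `project`, and `annotations_for` helpers are all hypothetical names introduced here. The sketch assumes an annotation is attached to a set of columns for every row that satisfies a scope query, and it shows only selection and projection.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Sequence, Tuple

Row = Dict[str, object]
Predicate = Callable[[Row], bool]


@dataclass(frozen=True)
class Annotation:
    """Hypothetical implicit annotation: a note attached via a query."""
    text: str
    scope: Predicate            # query choosing which rows are annotated
    targets: FrozenSet[str]     # columns the note is attached to
    reads: FrozenSet[str]       # columns the scope query needs to evaluate


def annotations_for(row: Row, column: str,
                    annotations: Sequence[Annotation]) -> List[str]:
    """Resolve the notes for one cell by evaluating each scope query."""
    return [a.text for a in annotations
            if column in a.targets and a.scope(row)]


def select(rows: Sequence[Row], annotations: Sequence[Annotation],
           condition: Predicate) -> Tuple[List[Row], List[Annotation]]:
    """Selection: rows are filtered; scopes still apply to surviving rows."""
    return [r for r in rows if condition(r)], list(annotations)


def project(rows: Sequence[Row], annotations: Sequence[Annotation],
            keep: Sequence[str]) -> Tuple[List[Row], List[Annotation]]:
    """Projection: recompute each annotation's target columns.

    Assumption of this sketch: an annotation whose scope query reads a
    dropped column is discarded, because the query can no longer be
    evaluated on the derived rows. This is the kind of information loss
    the abstract mentions; the sketch does not attempt the authors' fix.
    """
    kept = frozenset(keep)
    out_rows = [{c: r[c] for c in keep} for r in rows]
    out_anns = []
    for a in annotations:
        remaining = a.targets & kept
        if remaining and a.reads <= kept:
            out_anns.append(Annotation(a.text, a.scope, remaining, a.reads))
    return out_rows, out_anns
```

In this sketch, no per-cell annotation copies are stored: a note is resolved on demand by evaluating its scope query against a row. Projecting away a column that a scope query depends on silently loses the association, which shows why derivation operations may need to be extended to carry extra information.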