Meta-line: lineage information for improved metadata quality

Controlled content quality, including the quality of indexing, is one of the major advantages of digital libraries over general Web sources and Web search engines. However, given today's flood of information, the mostly manual effort of acquiring new sources and creating suitable (semantic) metadata for content indexing and retrieval is already prohibitive. One recent remedy is the automatic generation of metadata, for which various methods are becoming more widespread. In this setting, neglecting quality assurance is even more problematic: heuristic generation often fails, and the resulting low-quality metadata directly diminishes the quality of service that a digital library provides. To address this problem, we propose a metadata quality model that determines the overall quality of a metadata set and validates individual requirements imposed on it. Furthermore, lineage information is provided to trace how the quality of a metadata set evolves.
