Supporting the Production of High-Quality Data in Concurrent Plant Engineering Using a MetaDataRepository

In recent years, several process models for data quality management have been proposed. As data quality problems are highly application-specific, these models have to remain abstract. This leaves unanswered the question of exactly what to do in a given situation. The task of implementing a data quality process is usually delegated to data quality experts. To do so, they rely heavily on input from domain experts, especially regarding data quality rules. However, in large engineering projects, the number of rules is very large, and different domain experts might have different data quality needs. This considerably complicates the task of the data quality experts. Nevertheless, the domain experts need quality measures to help them decide which data quality problems to solve most urgently. In this paper, we propose a MetaDataRepository architecture that allows domain experts to model their quality expectations without help from technical experts. It balances three conflicting goals: non-intrusiveness, ease of use for domain experts, and sufficient expressive power to handle the most common data quality problems in a large concurrent engineering environment.
