Data Quality on the Web (Objectives and Goals of the Seminar)

The aim of this paper is to provide participants of the seminar with background information on the general notion of data quality (DQ), to illustrate a few agreed-upon and frequently used concepts and definitions, and to detail open problems to be addressed during the seminar. The paper is not meant as a complete and comprehensive overview of all aspects of data quality in the context of databases, information systems, or the Web. Rather, it is intended to give participants a basis for the seminar, outline specific foci, and raise questions and problems to be addressed extensively (and hopefully solved) during the seminar. In Section 2, we summarize some basic settings, definitions, and concepts commonly used in the context of data quality, and give references to relevant literature that discusses these aspects in more detail. In Section 3, we detail a list of questions that arise when data quality is of concern. In Section 4, we propose application domains and scenarios in which these questions are to be studied and solutions developed. Neither the list of DQ questions nor the list of application domains and settings is complete, but both should illustrate the depth and breadth with which we expect data quality aspects to be covered in different settings. In Section 5, we summarize the objectives and outcomes to be taken into account by the working groups.
