Data Quality and Information Compliance

This chapter focuses on the importance of defining data quality expectations and measuring data quality against those expectations. It also examines the general perception of data quality and the understanding a savvy manager needs in order to distinguish between data cleansing and data quality. In the business intelligence (BI)/data warehouse user community, there is growing confusion about the difference between data cleansing and data quality. Although many data cleansing products can help in applying data edits to name and address data or in transforming data during an extract/transform/load (ETL) process, there is no persistence in data cleansing: each time a data warehouse is populated or updated, the same corrections are applied to the same data. Improved data quality, by contrast, is the result of a business improvement process that identifies and eliminates the root causes of bad data. A critical component of improving data quality is the ability to distinguish between “good” (valid) data and “bad” (invalid) data. But because data values appear in many contexts, formats, and frameworks, this simple concept devolves into complicated notions of what it means for a data value to be valid.
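
The idea of stating expectations and then measuring data against them can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the field names (name, zip), the two rules, and the sample records are hypothetical and do not come from the chapter. Each expectation is a named validity test, and the measurement is simply the fraction of records that pass it.

```python
import re
from typing import Callable, Dict, List

Record = Dict[str, str]

# Hypothetical expectations: each rule name maps to a test that
# returns True when a record is valid with respect to that rule.
EXPECTATIONS: Dict[str, Callable[[Record], bool]] = {
    "zip_is_5_digits": lambda r: re.fullmatch(r"[0-9]{5}", r.get("zip", "")) is not None,
    "name_not_blank": lambda r: bool(r.get("name", "").strip()),
}


def measure(records: List[Record]) -> Dict[str, float]:
    """Return, for each expectation, the fraction of records that conform."""
    if not records:
        # No records means no violations; report full conformance.
        return {name: 1.0 for name in EXPECTATIONS}
    return {
        name: sum(1 for r in records if rule(r)) / len(records)
        for name, rule in EXPECTATIONS.items()
    }


# Hypothetical sample data containing a few typical defects.
SAMPLE: List[Record] = [
    {"name": "Ada Lovelace", "zip": "02139"},
    {"name": "  ", "zip": "2139"},            # blank name, short ZIP
    {"name": "Alan Turing", "zip": "1O001"},  # letter O in place of zero
    {"name": "Grace Hopper", "zip": "22202"},
]

scores = measure(SAMPLE)
for rule_name, score in scores.items():
    print(f"{rule_name}: {score:.0%} conforming")
```

Note that the sketch only measures conformance; it does not correct anything. That mirrors the distinction drawn above: a cleansing step would patch the failing values on every load, whereas a measurement tracked over time shows whether the root causes of the bad data are actually being eliminated.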