Data Profiling Technology of Data Governance Regarding Big Data: Review and Rethinking

Data profiling technology is valuable for data governance and data quality control because it is needed to verify and review the quality of structured, semi-structured, and unstructured data. In this paper, we first review relevant work and discuss existing definitions of data profiling. Second, we offer a new definition and propose new classifications for data profiling tasks. Third, we present several free and commercial profiling tools. Fourth, we propose new data quality metrics and a method for calculating a data quality score. Finally, we discuss a framework for a data profiling tool for big data.