T-REx: Table Repair Explanations

Data repair is a common and crucial step in many frameworks today, as applications may use data from different sources and of different levels of credibility. Thus, this step has been the focus of many works, proposing diverse approaches. To assist users in understanding the output of such data repair algorithms, we propose T-REx, a system for providing data repair explanations through Shapley values. The system is generic and not specific to a given repair algorithm or approach: it treats the algorithm as a black box. Given a specific table cell selected by the user, T-REx employs Shapley values to explain the significance of each constraint and each table cell in the repair of the cell of interest. T-REx then ranks the constraints and table cells according to their importance in the repair of this cell. This explanation allows users to understand the repair process, as well as to act based on this knowledge, to modify the most influencing constraints or the original database.

[1]  Scott Lundberg,et al.  A Unified Approach to Interpreting Model Predictions , 2017, NIPS.

[2]  Christopher Ré,et al.  The HoloClean Framework Dataset to be cleaned Denial Constraints External Information t 1 t 4 t 2 t 3 Johnnyo ’ s , 2017 .

[3]  Paolo Papotti,et al.  Discovering Denial Constraints , 2013, Proc. VLDB Endow..

[4]  L. Shapley A Value for n-person Games , 1988 .

[5]  Paolo Papotti,et al.  Holistic data cleaning: Putting violations into context , 2013, 2013 IEEE 29th International Conference on Data Engineering (ICDE).

[6]  Renée J. Miller,et al.  Continuous data cleaning , 2014, 2014 IEEE 30th International Conference on Data Engineering.

[7]  Wenfei Fan,et al.  Conditional Functional Dependencies for Data Cleaning , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[8]  L. S. Shapley,et al.  17. A Value for n-Person Games , 1953 .

[9]  Erik Strumbelj,et al.  Explaining prediction models and individual predictions with feature contributions , 2014, Knowledge and Information Systems.

[10]  Philip Wolfe,et al.  Contributions to the theory of games , 1953 .