Provenance: On and Behind the Screens

Collecting and processing provenance, i.e., information describing the production process of some end product, is important in various applications, e.g., to assess quality, to ensure reproducibility, or to reinforce trust in the end product. In the past, different types of provenance metadata have been proposed, each with a different scope. The first part of the proposed tutorial provides an overview and comparison of these different types of provenance. To put provenance to good use, it is essential to be able to interact with and present provenance data in a user-friendly way. Users interested in provenance are often not experts in databases or query languages; they are typically domain experts in the product and production process for which provenance is collected (biologists, journalists, etc.). Furthermore, in some scenarios, it is difficult to analyze and explore provenance data using queries alone. The second part of this tutorial therefore focuses on enabling users to leverage provenance through adapted visualizations. To this end, we will present some fundamental concepts of visualization before we discuss possible visualizations for provenance.
