Visualization of network data provenance

Visualization facilitates the understanding of scientific data through both exploration and explanation of the visualized data. Provenance also contributes to the understanding of data by capturing the contributing factors behind a result. The visualization of provenance, although supported in existing workflow management systems, generally focuses on small- to medium-sized provenance data and lacks techniques for dealing with large, highly complex data. This paper discusses visualization techniques developed for the exploration and explanation of provenance, including a layout algorithm, a visual style, graph abstraction techniques, and a graph matching algorithm, all designed to cope with this high complexity. We demonstrate the techniques by applying them to two extensively analyzed case studies that involved provenance capture and use over three-year projects: the first concerns the provenance of a satellite imagery ingest processing pipeline, and the second the provenance in a large-scale computer network testbed.
