Visual Compression of Workflow Visualizations with Automated Detection of Macro Motifs

This paper addresses the creation of 'macros' in workflow visualization as a support tool for making data curation tasks more efficient. We propose computing candidate macros from their usage in large collections of workflows held in data repositories, and we describe an efficient algorithm for extracting macro motifs from workflow graphs. We found that the state transition information used to identify macro candidates also characterizes the structural pattern of a macro and can be harnessed in the visual design of the corresponding macro glyph. This enables partial automation of, and consistency in, glyph design across a large set of macro glyphs. We tested the approach against a repository of biological data holding some 9,670 workflows and found that the algorithmically generated candidate macros are in keeping with domain expert expectations.
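
To make the idea of usage-based macro candidates concrete, the following is a minimal sketch and not the algorithm described in the paper. It assumes each workflow is a directed graph with labelled nodes, restricts motifs to short linear paths, and ranks them by the number of workflows in which they occur. The function names (`path_motifs`, `candidate_macros`), the `min_support` and `max_len` parameters, and the node labels in the usage note are all hypothetical.

```python
from collections import Counter


def path_motifs(edges, labels, max_len=3):
    """Return the set of label sequences found along directed simple
    paths of 2..max_len nodes in one workflow graph.

    edges  -- iterable of (source, target) node ids
    labels -- dict mapping node id to a step-type label (assumed input)
    """
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, []).append(dst)

    found = set()

    def extend(path):
        # Record every path with at least two nodes as a motif instance.
        if len(path) >= 2:
            found.add(tuple(labels[n] for n in path))
        if len(path) == max_len:
            return
        for nxt in successors.get(path[-1], []):
            if nxt not in path:  # keep paths simple (no repeated nodes)
                extend(path + [nxt])

    for node in labels:
        extend([node])
    return found


def candidate_macros(workflows, min_support=2, max_len=3):
    """Count, for each motif, the number of workflows containing it and
    keep those that reach min_support.

    workflows -- iterable of (edges, labels) pairs
    """
    support = Counter()
    for edges, labels in workflows:
        # A set per workflow means each motif counts once per workflow.
        support.update(path_motifs(edges, labels, max_len))
    return {motif: n for motif, n in support.items() if n >= min_support}
```

For example, given three toy workflows in which the steps `sample -> extract` and `extract -> assay` each appear in two of them, `candidate_macros(workflows, min_support=2)` would keep exactly those two label sequences. A realistic implementation would also need to handle branching motifs and graph isomorphism, which this sketch deliberately leaves out.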
