Scientific workflows in data analysis: Bridging expertise across multiple domains

Abstract: In this paper, we demonstrate how scientific workflows bridge expertise across multiple domains by repurposing workflow fragments for text analysis, image analysis, and the analysis of activity in video. We highlight how reusing workflows lets scientists link across disciplines and gain the benefits of interdisciplinary research beyond their usual area of expertise. In addition, we present in-depth studies of several tasks: text analysis, multimedia analysis involving both images and text, video activity analysis, and analysis of artistic style using deep learning. These tasks show how reusing workflow fragments can turn a pre-existing, rudimentary approach into an expert-grade analysis. We also examine how workflow fragments save time and effort while combining expertise from multiple areas such as machine learning and computer vision.
