A multi-dimensional classification model for scientific workflow characteristics

Workflows have been used to model repeatable tasks or operations in manufacturing, business processes, and software. In recent years, workflows have increasingly been used to orchestrate scientific discovery tasks that use distributed resources and web services environments through resource models such as grid and cloud computing. Workflows have disparate requirements and constraints that affect how they might be managed in distributed environments. In this paper, we present a multi-dimensional classification model illustrated by workflow examples obtained through a survey of scientists from different domains, including bioinformatics and biomedical science, weather and ocean modeling, and astronomy, detailing their data and computational requirements. The survey results and classification model contribute to a high-level understanding of scientific workflows.
