Debugging Scientific Workflows with Provenance: Achievements and Lessons Learned

Scientific Workflow Management Systems manage large-scale experiments and deliver provenance data. Provenance data represents the behavior of the workflow execution and makes it possible to trace how the data flow was generated. When provenance is extended with execution performance data, it becomes an important asset for identifying and analyzing errors that occurred during the workflow execution (i.e., debugging). Debugging is essential for workflows that execute in parallel in large-scale distributed environments, since errors in this type of execution are frequent and difficult to track. By debugging at runtime, scientists can identify errors and take the necessary actions while the workflow is still running. We present provenance-based debugging in real use cases running in parallel on virtual machines in clouds. In these experiences, scientists query the domain and execution data held in provenance to detect errors, especially when the execution concludes without any system error message.
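As a rough illustration of the kind of query described above, the sketch below looks for tasks that finished with a success exit code but produced an implausible domain value. It is not the schema or tooling used in the work itself: the tables `activation` and `domain_output`, the field `num_sequences`, the sample rows, and both function names are hypothetical, and an in-memory SQLite database stands in for a provenance repository.

```python
import sqlite3


def build_provenance_db():
    """Create a tiny, hypothetical provenance database in memory."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE activation (      -- execution data (assumed layout)
            task_id   INTEGER PRIMARY KEY,
            activity  TEXT,
            vm        TEXT,
            exit_code INTEGER,
            elapsed_s REAL
        );
        CREATE TABLE domain_output (   -- domain data (assumed layout)
            task_id INTEGER,
            name    TEXT,
            value   REAL
        );
    """)
    conn.executemany("INSERT INTO activation VALUES (?, ?, ?, ?, ?)", [
        (1, "alignment", "vm-1", 0, 12.0),
        (2, "alignment", "vm-2", 0, 11.5),
        (3, "alignment", "vm-2", 0, 0.2),   # exits cleanly, result is empty
        (4, "alignment", "vm-3", 1, 3.0),   # fails with a visible error code
    ])
    conn.executemany("INSERT INTO domain_output VALUES (?, ?, ?)", [
        (1, "num_sequences", 250.0),
        (2, "num_sequences", 243.0),
        (3, "num_sequences", 0.0),
    ])
    return conn


def silent_failures(conn):
    """Return tasks with exit code 0 whose domain value is missing or non-positive."""
    return conn.execute("""
        SELECT a.task_id, a.vm, a.elapsed_s, d.value
        FROM activation a
        LEFT JOIN domain_output d
               ON d.task_id = a.task_id AND d.name = 'num_sequences'
        WHERE a.exit_code = 0
          AND (d.value IS NULL OR d.value <= 0)
        ORDER BY a.task_id
    """).fetchall()


if __name__ == "__main__":
    for row in silent_failures(build_provenance_db()):
        print(row)
```

Joining execution data with domain data is what surfaces task 3 here: its exit code alone gives no hint of a problem, while task 4 is excluded because its failure is already visible to the system.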
