A science-gateway workload archive application to the self-healing of workflow incidents

Information about the execution of distributed workloads is important for studies in computer science and engineering, but workloads acquired at the infrastructure level reputedly lack information about users and application-level middleware. Workloads acquired at the science-gateway level, by contrast, contain detailed information about users, pilot jobs, task sub-steps, bags of tasks, and workflow executions. In this work, we present a science-gateway workload archive, illustrate its possibilities with a few case studies, and use it for the autonomic handling of workflow incidents.
