Job Life Cycle Management Libraries for CMS Workflow Management Projects

Scientific analysis and simulation require the processing and generation of millions of data samples. These tasks are often composed of multiple smaller tasks distributed over multiple computing sites. This paper discusses the Compact Muon Solenoid (CMS) workflow infrastructure, and specifically the Python-based workflow library used for so-called task life cycle management. The CMS workflow infrastructure consists of three layers: high-level specification of the various tasks based on input/output data sets, life cycle management of task instances derived from the high-level specification, and execution management. The workflow library is the result of a convergence of three CMS subprojects that deal with scientific analysis, simulation, and real-time data aggregation from the experiment, respectively. This convergence is expected to reduce duplication and hence development and maintenance costs.
