The benefits of service choreography for data-intensive computing

As the number of services and the size of data involved in workflows increase, centralised orchestration techniques are reaching the limits of scalability. In the classic orchestration model, all data pass through a centralised engine, which results in unnecessary data transfer and wasted bandwidth, and turns the engine into a bottleneck for the execution of a workflow. Choreography techniques, although more complex to model, offer a decentralised alternative and are the optimal architecture for data-centric workflows: data are passed directly to where they are required, at the next service in the workflow. Orchestration remains the dominant architectural approach, and there are relatively few choreography languages and even fewer concrete implementations. This paper's contributions are twofold. Firstly, we argue the case for choreography in data-intensive computing and demonstrate through workflow patterns the scalability advantages of adopting a choreography architecture. Secondly, we introduce the Lightweight Coordination Calculus (LCC), a process calculus used to formally define choreographies, and the OpenKnowledge framework, a choreography-based architecture that provides the functionality for peers to coordinate in an open peer-to-peer system. Through LCC and the OpenKnowledge framework, we demonstrate in practice how choreography can be achieved in a lightweight manner with a comparatively simple process language.
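
The difference in data movement between the two models can be illustrated with a minimal sketch. It is not taken from the paper: it assumes a simple linear pipeline of services in which every service emits as much data as it receives, and the function names and result fields are hypothetical. Under orchestration the engine sends the input to each service and receives the output back; under choreography each service forwards its output directly to the next one.

```python
def orchestrated_transfers(num_services: int, data_size: int) -> dict:
    """Data movement when a central engine relays every intermediate result.

    Assumption: the engine holds the initial input and collects the final
    output, so each service costs two transfers (engine -> service,
    service -> engine).
    """
    hops = 2 * num_services
    total = hops * data_size
    # Every transfer has the engine at one end.
    return {"hops": hops, "total": total, "through_engine": total}


def choreographed_transfers(num_services: int, data_size: int) -> dict:
    """Data movement when services pass results directly to each other.

    Assumption: one transfer from the data source to the first service,
    one between each pair of consecutive services, and one from the last
    service to the final consumer.
    """
    hops = num_services + 1
    total = hops * data_size
    # No central engine sits on the data path.
    return {"hops": hops, "total": total, "through_engine": 0}


if __name__ == "__main__":
    n, size = 4, 100  # four services, 100 units of data per transfer
    print("orchestration:", orchestrated_transfers(n, size))
    print("choreography: ", choreographed_transfers(n, size))
```

In this simplified model the orchestrated pipeline needs 2n transfers, all of which touch the engine, while the choreographed pipeline needs n + 1 transfers and none pass through a central point. Real workflows with fan-out, fan-in, or services that change the data size would need a richer model.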
