The Circulate architecture: avoiding workflow bottlenecks caused by centralised orchestration

As the number of services and the size of the data involved in workflows increase, centralised orchestration techniques are reaching the limits of scalability. In the classic orchestration model, all data passes through a centralised engine, which results in unnecessary data transfer and wasted bandwidth, and causes the engine to become a bottleneck to the execution of a workflow. This paper presents and evaluates the Circulate architecture, which maintains the robustness and simplicity of centralised orchestration but facilitates choreography by allowing services to exchange data directly with one another. Circulate could be realised within any existing workflow framework; in this paper, we focus on WS-Circulate, a Web services-based implementation. Taking inspiration from the Montage workflow, a number of common workflow patterns (sequence, fan-in and fan-out), input-to-output data size relationships and network configurations are identified and evaluated. The performance analysis concludes that a substantial reduction in communication overhead results in a 2–4-fold performance benefit across all patterns. An end-to-end pattern through the Montage workflow results in an 8-fold performance benefit and demonstrates how the advantage of using the Circulate architecture increases as the complexity of a workflow grows.
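To make the data-movement argument concrete, the following is a minimal sketch, not the WS-Circulate implementation, that counts payload bytes moved over the network for a sequence pattern under two models. The function names and the `sizes` list are hypothetical; the model assumes that under centralised orchestration every intermediate result returns to the engine before being forwarded, whereas under direct exchange each result moves once to the next hop. Control messages, latency and serialisation costs are ignored.

```python
def centralised_bytes(sizes):
    """Payload bytes moved when every hop is routed through the engine.

    sizes[0] is the initial input held by the engine (assumed);
    sizes[i] is the output size of the i-th service in the sequence.
    """
    total = 0
    for i in range(1, len(sizes)):
        total += sizes[i - 1]  # engine sends the input to service i
        total += sizes[i]      # service i returns its output to the engine
    return total


def direct_bytes(sizes):
    """Payload bytes moved when services pass data directly to each other.

    The engine sends the initial input once, each intermediate output
    travels once to the next service, and the final output returns to
    the engine, so every element of sizes is transferred exactly once.
    """
    return sum(sizes)


if __name__ == "__main__":
    # Three services in sequence, each consuming and producing 10 units.
    sizes = [10, 10, 10, 10]
    print(centralised_bytes(sizes), direct_bytes(sizes))
```

In this toy model, intermediate results are transferred twice under centralised orchestration and once under direct exchange, so the gap widens as the sequence grows. The figures it produces are illustrative only and are not the measurements reported in the paper.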
