Performance Analytic Models and Analyses for Workflow Architectures

The design and implementation of a workflow management system is typically a large and complex task. Decisions must be made about the hardware and software platforms, the data structures, the algorithms, and the network interconnection of the modules used by various users and administrators. These decisions are further complicated by requirements such as flexibility, robustness, modifiability, availability, performance, and usability. As workflow systems grow, organizations are finding that standard client/server architectures and off-the-shelf solutions are not adequate. We anticipate that very large-scale workflow systems (VLSW) will become both more complex and more prevalent. One further requirement is therefore the emphasis of this paper: scalability. For our scalable workflow investigations, we describe a framework, a taxonomy, a model, and a methodology for investigating the performance of various workflow architectures as the size of the system (the number of workcases) grows very large.
First, this paper presents a novel workflow architectural framework and taxonomy. We survey several current workflow products and research prototype systems to illustrate some of the taxonomical categories. Most current workflow architectures fall into only one of the many categories of this taxonomy: the centralized client/server category. The paper next explains a performance analysis methodology useful for exploring this taxonomy. The methodology uses a layered queuing model and analyzes it mathematically using a modified method of layers (MOL) combined with a linearization algorithm. Finally, the paper applies this methodology to compare and contrast the architectural categories, showing how their performance changes as the number of workcases increases.
Our analytic results suggest that (a) in determining VLSW performance, software architecture is as important as hardware architecture, and (b) alternatives to the client/server architecture provide significantly better scalability.