Throughput based temporal verification for monitoring large batch of parallel processes

On-time completion is one of the most important QoS (Quality of Service) dimensions for business processes running in the cloud. While today’s business systems often need to handle thousands of concurrent user requests, process monitoring is still basically conducted one process at a time. The strategies for monitoring a single process could be repeated a thousand times to monitor a thousand parallel processes, but the time overhead would then increase a thousandfold as well, which poses a major challenge for process monitoring. In this paper, based on a novel runtime throughput consistency model, we propose a QoS-aware, throughput-based checkpoint selection strategy that dynamically selects a small number of checkpoints along the system timeline to facilitate the temporal verification of throughput constraints and achieve the target on-time completion rate. The experimental results demonstrate that our strategy achieves the best efficiency and effectiveness compared with the state-of-the-art as well as other representative response-time-based checkpoint selection strategies.
