SwinDeW-C: A Peer-to-Peer Based Cloud Workflow System

Workflow systems are designed to support the process automation of large scale business and scientific applications. In recent years, many workflow systems have been deployed on high performance computing infrastructures such as cluster, peer-to-peer (p2p), and grid computing (Moore, 2004; Wang, Jie, & Chen, 2009; Yang, Liu, Chen, Lignier, & Jin, 2007). One of the driving forces is the increasing demand for large scale instance intensive and data/computation intensive workflow applications (large scale workflow applications for short), which are common in both eBusiness and eScience application areas. Typical examples (detailed in Section 13.2.1) include the transaction intensive nation-wide insurance claim application process and the data and computation intensive pulsar searching process in Astrophysics. Generally speaking, instance intensive applications are those processes which need to be executed a large number of times, either sequentially within a very short period or concurrently with a large number of instances (Liu, Chen, Yang, & Jin, 2008; Liu et al., 2010; Yang et al., 2008). Therefore, large scale workflow applications normally require the support of high performance computing infrastructures (e.g. advanced CPU units, large memory space and high speed network), especially when the workflow activities are themselves data and computation intensive. In the real world, to accommodate such a request, expensive computing infrastructures, including supercomputers and data servers, are bought, installed, integrated and maintained at huge cost by system users.

[1]  Ann L. Chervenak,et al.  Data Management Challenges of Data-Intensive Scientific Workflows , 2008, 2008 Eighth IEEE International Symposium on Cluster Computing and the Grid (CCGRID).

[2]  Xiao Liu,et al.  A Compromised-Time-Cost Scheduling Algorithm in SwinDeW-C for Instance-Intensive Cost-Constrained Workflows on a Cloud Computing Platform , 2010, Int. J. High Perform. Comput. Appl..

[3]  Xiao Liu,et al.  Forecasting Duration Intervals of Scientific Workflow Activities Based on Time-Series Patterns , 2008, 2008 IEEE Fourth International Conference on eScience.

[4]  Xiao Liu,et al.  A Probabilistic Strategy for Setting Temporal Constraints in Scientific Workflows , 2008, BPM.

[5]  Rajkumar Buyya,et al.  Cloud computing and emerging IT platforms: Vision, hype, and reality for delivering computing as the 5th utility , 2009, Future Gener. Comput. Syst..

[6]  Hai Jin,et al.  Peer-to-Peer Based Grid Workflow Runtime Environment of SwinDeW-G , 2007, Third IEEE International Conference on e-Science and Grid Computing (e-Science 2007).

[7]  Ian J. Taylor,et al.  Workflows and e-Science: An overview of workflow system features and capabilities , 2009, Future Gener. Comput. Syst..

[8]  Danilo Ardagna,et al.  Adaptive Service Composition in Flexible Processes , 2007, IEEE Transactions on Software Engineering.

[9]  Jinjun Chen,et al.  A taxonomy of grid workflow verification and validation , 2008, Concurr. Comput. Pract. Exp..

[10]  Rajkumar Buyya,et al.  A Taxonomy of Workflow Management Systems for Grid Computing , 2005, Proceedings of the 38th Annual Hawaii International Conference on System Sciences.

[11]  Kevin Barraclough,et al.  I and i , 2001, BMJ : British Medical Journal.

[12]  Miron Livny,et al.  Data placement for scientific applications in distributed environments , 2007, 2007 8th IEEE/ACM International Conference on Grid Computing.

[13]  Xiao Liu,et al.  A cost-effective strategy for intermediate data storage in scientific cloud workflow systems , 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS).

[14]  Yong Zhao,et al.  Cloud Computing and Grid Computing 360-Degree Compared , 2008, GCE 2008.

[15]  Jinjun Chen,et al.  Grid Computing: Infrastructure, Service, and Applications , 2009 .

[17]  Lizhe Wang,et al.  Performance evaluation of virtual machine‐based Grid workflow system , 2008, Concurr. Comput. Pract. Exp..

[18]  G. Bruce Berriman,et al.  On the Use of Cloud Computing for Scientific Workflows , 2008, 2008 IEEE Fourth International Conference on eScience.

[19]  Jinjun Chen,et al.  Multiple states based temporal consistency for dynamic verification of fixed‐time constraints in Grid workflow systems , 2007, Concurr. Comput. Pract. Exp..

[20]  Zhou Lei,et al.  Grid resource allocation , 2009 .

[21]  Vijay Varadharajan,et al.  Enhancing grid security with trust management , 2004, IEEE International Conference on Services Computing, 2004. (SCC 2004). Proceedings. 2004.

[22]  Thomas Erl,et al.  SOA Principles of Service Design , 2007 .

[23]  Jinjun Chen,et al.  Temporal dependency based checkpoint selection for dynamic verification of fixed-time constraints in grid workflow systems , 2008, 2008 ACM/IEEE 30th International Conference on Software Engineering.

[24]  Xiao Liu,et al.  A data placement strategy in scientific cloud workflows , 2010, Future Gener. Comput. Syst..

[25]  Renato Figueiredo,et al.  Science Clouds: Early Experiences in Cloud Computing for Scientific Applications , 2008 .

[26]  Rajkumar Buyya,et al.  CloudSim: A Novel Framework for Modeling and Simulation of Cloud Computing Infrastructures and Services , 2009, ArXiv.

[27]  Michelle D. Moore,et al.  An accurate parallel genetic algorithm to schedule tasks on a cluster , 2004, Parallel Comput..

[28]  Sriram Ramabhadran,et al.  Cloud control with distributed rate limiting , 2007, SIGCOMM '07.

[29]  Aaron Weiss,et al.  Can the PC go green? , 2007, NTWK.

[30]  Xiao Liu,et al.  An Algorithm in SwinDeW-C for Scheduling Transaction-Intensive Cost-Constrained Cloud Workflows , 2008, 2008 IEEE Fourth International Conference on eScience.

[31]  Yogesh L. Simmhan,et al.  A survey of data provenance in e-science , 2005, SGMD.

[32]  Franck Cappello,et al.  Cost-benefit analysis of Cloud Computing versus desktop grids , 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.

[33]  James Frew,et al.  Lineage retrieval for scientific data processing: a survey , 2005, CSUR.

[34]  Jinjun Chen,et al.  Adaptive selection of necessary and sufficient checkpoints for dynamic verification of temporal constraints in grid workflow systems , 2007, TAAS.

[35]  Douglas Thain,et al.  All-pairs: An abstraction for data-intensive cloud computing , 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.

[36]  Rajkumar Buyya,et al.  Evaluating the cost-benefit of using cloud computing to extend the capacity of clusters , 2009, HPDC '09.

[37]  Kent E. Seamons,et al.  Content-triggered trust negotiation , 2004, TSEC.

[38]  Miron Livny,et al.  The cost of doing science on the cloud: The Montage example , 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.

[39]  Hai Jin,et al.  A throughput maximization strategy for scheduling transaction‐intensive workflows on SwinDeW‐G , 2008, Concurr. Comput. Pract. Exp..

[40]  Ninghui Li,et al.  Safety in automated trust negotiation , 2004, IEEE Symposium on Security and Privacy, 2004. Proceedings. 2004.

[41]  Paul J. Schweitzer,et al.  Problem Decomposition and Data Reorganization by a Clustering Technique , 1972, Oper. Res..

[42]  Jinjun Chen,et al.  Temporal dependency-based checkpoint selection for dynamic verification of temporal constraints in scientific workflow systems , 2011, TSEM.

[43]  Randy H. Katz,et al.  Above the Clouds: A Berkeley View of Cloud Computing , 2009 .

[44]  B. Michelaard The Pegasus Project , 2000 .

[45]  Xiao Liu,et al.  Handling Recoverable Temporal Violations in Scientific Workflow Systems: A Workflow Rescheduling Based Strategy , 2010, 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing.

[46]  Elisa Bertino,et al.  Trust Negotiation in Identity Management , 2007, IEEE Security & Privacy.

[47]  Roger Smith,et al.  Computing in the Cloud , 2009 .