Banking on decoupling: budget-driven sustainability for HPC applications on auction-based clouds
[1] Jack J. Dongarra, et al. FT-MPI: Fault Tolerant MPI, Supporting Dynamic Applications in a Dynamic World, 2000, PVM/MPI.
[2] Miron Livny, et al. Checkpoint and Migration of UNIX Processes in the Condor Distributed Processing System, 1997.
[3] Georg Stellner, et al. CoCheck: checkpointing and process migration for MPI, 1996, Proceedings of International Conference on Parallel Processing.
[4] Artur Andrzejak, et al. Decision Model for Cloud Computing under SLA Constraints, 2010, 2010 IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems.
[5] Harrick M. Vin, et al. Egida: an extensible toolkit for low-overhead fault-tolerance, 1999, Digest of Papers. Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing (Cat. No.99CB36352).
[6] Artur Andrzejak, et al. Reducing Costs of Spot Instances via Checkpointing in the Amazon Elastic Compute Cloud, 2010, 2010 IEEE 3rd International Conference on Cloud Computing.
[7] Bronis R. de Supinski, et al. Design, Modeling, and Evaluation of a Scalable Multi-level Checkpointing System, 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[8] Roy Friedman, et al. Starfish: Fault-Tolerant Dynamic MPI Programs on Clusters of Workstations, 1999, Proceedings. The Eighth International Symposium on High Performance Distributed Computing (Cat. No.99TH8469).
[9] Muli Ben-Yehuda, et al. The Resource-as-a-Service (RaaS) Cloud, 2012, HotCloud.
[10] Justin Y. Shi, et al. Decoupling as a Foundation for Large Scale Parallel Computing, 2009, 2009 11th IEEE International Conference on High Performance Computing and Communications.
[11] Sanjay Ghemawat, et al. MapReduce: Simplified Data Processing on Large Clusters, 2004, OSDI.
[12] Asser N. Tantawi, et al. See Spot Run: Using Spot Instances for MapReduce Workflows, 2010, HotCloud.
[13] Abdallah Khreishah, et al. Program Scalability Analysis for HPC Cloud: Applying Amdahl's Law to NAS Benchmarks, 2012, 2012 SC Companion: High Performance Computing, Networking Storage and Analysis.
[14] Rajkumar Buyya, et al. Provisioning Spot Market Cloud Resources to Create Cost-Effective Virtual Clusters, 2011, ICA3PP.
[15] Andrew Lumsdaine, et al. The Design and Implementation of Checkpoint/Restart Process Fault Tolerance for Open MPI, 2007, 2007 IEEE International Parallel and Distributed Processing Symposium.
[16] R. Buyya, et al. Comprehensive Statistical Analysis and Modeling of Spot Instances in Public Cloud Environments, 2011.
[17] Abdallah Khreishah, et al. SpotMPI: A Framework for Auction-Based HPC Computing Using Amazon Spot Instances, 2011, ICA3PP.
[18] Muli Ben-Yehuda, et al. Deconstructing Amazon EC2 Spot Instance Pricing, 2011, CloudCom.
[19] Abdallah Khreishah, et al. Resource Planning for Parallel Processing in the Cloud, 2011, 2011 IEEE International Conference on High Performance Computing and Communications.
[20] Michele Mazzucco, et al. Achieving Performance and Availability Guarantees with Spot Instances, 2011, 2011 IEEE International Conference on High Performance Computing and Communications.
[21] Geoffrey C. Fox, et al. Twister: a runtime for iterative MapReduce, 2010, HPDC '10.
[22] Laxmikant V. Kalé, et al. FTC-Charm++: an in-memory checkpoint-based fault tolerant runtime for Charm++ and MPI, 2004, 2004 IEEE International Conference on Cluster Computing (IEEE Cat. No.04EX935).
[23] Ronald Minnich, et al. A Network-Failure-Tolerant Message-Passing System for Terascale Clusters, 2002, ICS '02.
[24] Rajkumar Buyya, et al. Reliable Provisioning of Spot Instances for Compute-intensive Applications, 2011, 2012 IEEE 26th International Conference on Advanced Information Networking and Applications.