Fault-aware job scheduling for BlueGene/L systems
[1] Keiji Tani, et al. Job scheduling on the Earth Simulator, 2003.
[2] Ravishankar K. Iyer, et al. Error/failure analysis using event logs from fault tolerant systems, 1991, Digest of Papers. Fault-Tolerant Computing: The Twenty-First International Symposium.
[3] Attahiru Sule Alfa, et al. Advances in matrix-analytic methods for stochastic models, 1998.
[4] James S. Plank, et al. Processor Allocation and Checkpoint Interval Selection in Cluster Computing Systems, 2001, J. Parallel Distributed Comput.
[5] Anand Sivasubramaniam, et al. Critical event prediction for proactive management in large-scale computer clusters, 2003, KDD '03.
[6] Richard Wolski, et al. Time Sharing Massively Parallel Machines, 1995, ICPP.
[7] David F. Heidel, et al. An Overview of the BlueGene/L Supercomputer, 2002, ACM/IEEE SC 2002 Conference (SC'02).
[8] Daniel P. Siewiorek, et al. A comparative analysis of event tupling schemes, 1996, Proceedings of Annual Symposium on Fault Tolerant Computing.
[9] José E. Moreira, et al. Job Scheduling for the BlueGene/L System, 2002, JSSPP.
[10] Kavitha Ranganathan, et al. Decoupling computation and data scheduling in distributed data-intensive applications, 2002, Proceedings 11th IEEE International Symposium on High Performance Distributed Computing.
[11] Daniel P. Siewiorek, et al. VAX/VMS event monitoring and analysis, 1995, Twenty-Fifth International Symposium on Fault-Tolerant Computing. Digest of Papers.
[12] Susanne Albers, et al. Scheduling with unexpected machine breakdowns, 1999, Discret. Appl. Math.
[13] I. Rish, et al. Autonomic Computing Features for Large-scale Server Management and Control, 2003.
[14] Bala Kalyanasundaram, et al. Fault-tolerant scheduling, 1994, STOC '94.
[15] Xiao Qin, et al. An efficient fault-tolerant scheduling algorithm for real-time tasks with precedence constraints in heterogeneous systems, 2002, Proceedings International Conference on Parallel Processing.
[16] Marios C. Papaefthymiou, et al. Stochastic Analysis of Gang Scheduling in Parallel and Distributed Systems, 1996, Perform. Evaluation.
[17] R. Vilalta, et al. Providing Persistent and Consistent Resources through Event Log Analysis and Predictions for Large-scale Computing Systems, 2002.
[18] Dror G. Feitelson, et al. Improved Utilization and Responsiveness with Gang Scheduling, 1997, JSSPP.
[19] J. Moreira, et al. An Evaluation of Parallel Job Scheduling for ASCI Blue-Pacific, 1999, ACM/IEEE SC 1999 Conference (SC'99).
[20] Ricardo Vilalta, et al. Predicting rare events in temporal domains, 2002, 2002 IEEE International Conference on Data Mining, 2002. Proceedings.
[21] Mark S. Squillante, et al. Modeling and analysis of dynamic coscheduling in parallel and distributed environments, 2002, SIGMETRICS '02.