Replication Based Job Scheduling in Grids with Security Assurance

Security assurance is a critical requirement for satisfying QoS or SLA constraints in risky grid environments, because jobs may be scheduled onto multiple machines spread across different administrative domains. Unlike conventional methods that use a fixed number of job replicas, this paper proposes a security-aware scheduling algorithm for parallel, independent jobs that uses adaptive job replication to keep scheduling decisions secure, reliable, and fault tolerant. In risky and failure-prone grids, the number of replicas is adjusted according to the current security conditions and the end-user settings. Simulation results show that the algorithm is robust because of its adaptive job replication and rescheduling mechanism. The performance results also show that it outperforms non-security-aware scheduling algorithms based on fixed-number job replication.
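The idea of adjusting the replica count to security conditions and user settings can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the exponential risk model, the parameter `lam`, the independence of site failures, and the names `failure_probability` and `choose_replicas`. The sketch adds replicas on the most trusted sites until the estimated joint failure probability falls to or below the user's tolerance, or the user's replica cap is reached.

```python
import math


def failure_probability(security_demand, trust_level, lam=3.0):
    """Hypothetical risk model (an assumption, not the paper's):
    no risk when the site's trust level meets the job's security demand,
    otherwise risk grows exponentially with the shortfall."""
    shortfall = security_demand - trust_level
    if shortfall <= 0:
        return 0.0
    return 1.0 - math.exp(-lam * shortfall)


def choose_replicas(security_demand, site_trust_levels, max_failure, max_replicas):
    """Pick sites for replicas of one job.

    Returns (chosen_trust_levels, joint_failure), where joint_failure is the
    probability that every replica fails, assuming independent site failures.
    """
    # Try the most trusted sites first.
    ranked = sorted(site_trust_levels, reverse=True)
    chosen = []
    joint_failure = 1.0
    for trust in ranked[:max_replicas]:
        chosen.append(trust)
        joint_failure *= failure_probability(security_demand, trust)
        # Stop as soon as the user's tolerated failure probability is met.
        if joint_failure <= max_failure:
            break
    return chosen, joint_failure


if __name__ == "__main__":
    # A demanding job on weakly trusted sites needs more than one replica.
    sites, risk = choose_replicas(0.9, [0.6, 0.5, 0.4], max_failure=0.45, max_replicas=3)
    print(len(sites), round(risk, 3))
```

In this sketch a job whose security demand is met by at least one site receives a single replica, while a demanding job in a poorly trusted grid receives more, up to the cap. A fixed-number scheme would use the same count in both cases.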
