Maximizing reliability while scheduling real-time task-graphs on a cluster of computers

Improper scheduling of real-time applications on a cluster can cause missed deadlines and offset the gains from system and software parallelism. Most existing scheduling algorithms do not account for the effect on performance of factors such as real-time deadlines, system reliability, processing power fragmentation, inter-task communication, and degree of parallelism. In this paper we introduce a new scheduling algorithm that uses an objective function to guide the search for a near-optimal solution. The objective function combines several criteria: real-time deadlines, reliability, and quantitative measures of communication, degree of parallelism, and processing power fragmentation. Including multiple criteria may affect the overall acceptance rate of applications; we also investigate the effect of reliability on that rate.
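
To make the idea of a multi-criteria objective function concrete, the following is a minimal sketch, not the algorithm proposed in this paper. It assumes a weighted-sum form, treats the deadline as a hard feasibility test, and models reliability with the common exponential failure model R = exp(-sum(lambda_i * t_i)). All names (`Candidate`, `reliability`, `objective`), the weights, and the normalization of each measure to [0, 1] are hypothetical choices made for illustration.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Candidate:
    """One candidate schedule (hypothetical structure, for illustration only)."""
    finish_time: float          # estimated completion time of the task graph
    deadline: float             # real-time deadline of the application
    failure_rates: List[float]  # assumed per-processor failure rate (lambda_i)
    busy_times: List[float]     # execution time placed on each processor (t_i)
    communication: float        # inter-task communication measure, assumed in [0, 1]
    parallelism: float          # degree-of-parallelism measure, assumed in [0, 1]
    fragmentation: float        # processing power fragmentation, assumed in [0, 1]


def reliability(c: Candidate) -> float:
    # Assumed exponential failure model: R = exp(-sum(lambda_i * t_i)).
    exposure = sum(lam * t for lam, t in zip(c.failure_rates, c.busy_times))
    return math.exp(-exposure)


def objective(c: Candidate, w: Dict[str, float]) -> float:
    # A schedule that misses its deadline is rejected outright.
    if c.finish_time > c.deadline:
        return float("-inf")
    # Assumed weighted sum: reward reliability and parallelism,
    # penalize communication and fragmentation.
    return (w["reliability"] * reliability(c)
            + w["parallelism"] * c.parallelism
            - w["communication"] * c.communication
            - w["fragmentation"] * c.fragmentation)


weights = {"reliability": 1.0, "parallelism": 0.5,
           "communication": 0.5, "fragmentation": 0.5}
candidates = [
    Candidate(8.0, 10.0, [0.001, 0.002], [10.0, 5.0], 0.2, 0.6, 0.1),
    Candidate(12.0, 10.0, [0.001, 0.001], [6.0, 6.0], 0.1, 0.9, 0.0),  # misses deadline
    Candidate(9.0, 10.0, [0.010, 0.020], [10.0, 5.0], 0.2, 0.6, 0.1),
]
best = max(candidates, key=lambda c: objective(c, weights))
```

In this sketch a search procedure would call `objective` on each candidate it generates and keep the highest-scoring feasible one. Raising the reliability weight favors schedules placed on processors with lower failure rates, which may reject schedules that would otherwise be accepted; this is one way to picture the trade-off between reliability and acceptance rate discussed above.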
