An integrated approach towards aggressive state-tracking migration for maximizing performance benefit in distributed computing

This paper presents a new state-tracking migration scheme that is integrated with aggressive reservation strategies: immediate restart, greedy backfilling, and selective preemption. The main contribution of this paper is an analysis of the effects of these three techniques, which extend conventional migration schemes. Our simulation results suggest that state-tracking migration with selective preemption outperforms the others across the board. We also observe that the overall performance of the immediate restart strategy combined with migration remains stable under various job lifetime distributions. Moreover, we find that, under the state-tracking migration scheme, performance improves when the void intervals are filled with jobs governed by the immediate restart strategy rather than with queued jobs.
