D-STHARk: Evaluating Dynamic Scheduling of Tasks in Hybrid Simulated Architectures

The emergence of applications that must efficiently handle growing amounts of data has stimulated the development of new computing architectures with several Processing Units (PUs), such as CPU cores, graphics processing units (GPUs), and Intel Xeon Phi (MIC) coprocessors. To better exploit these architectures, recent works propose novel runtime environments that offer a variety of methods for scheduling tasks dynamically on different PUs. A main limitation of these proposals is the constrained set of system configurations typically used to tune and test them, since setting up more complete and diverse evaluation environments is costly. In this context, we present D-STHARk, a GUI tool for evaluating Dynamic Scheduling of Tasks in Hybrid Simulated ARchitectures. D-STHARk provides a complete simulated execution environment for evaluating dynamic scheduling strategies on simulated applications and hybrid architectures. We evaluate our tool by simulating the dynamic scheduling strategies presented in [3], using the same architecture and application. D-STHARk reached the same conclusions originally reported by the authors. Moreover, we performed an experiment varying the number of coprocessors, a scenario not previously verified due to the lack of real architectures, and it shows that energy consumption may be reduced while keeping the same performance.
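As an illustration of what simulating dynamic scheduling on a hybrid architecture can look like, the following is a minimal sketch and not D-STHARk's actual implementation. It assumes a simple demand-driven first-come, first-served policy in which an idle PU takes the next queued task. The names `PU` and `simulate_fcfs`, the per-PU power values, and the task run times are all hypothetical and chosen only for illustration.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class PU:
    name: str
    kind: str        # e.g. "cpu", "gpu", "mic"
    power_w: float   # hypothetical active power draw, in watts


def simulate_fcfs(tasks, pus):
    """Demand-driven FCFS: the PU that becomes idle first takes the next task.

    tasks: list of dicts mapping a PU kind to a simulated run time in seconds.
    Returns (makespan in seconds, active energy in joules, assignment list).
    """
    # Min-heap of (time the PU becomes free, unique tie-break index, PU).
    free_at = [(0.0, i, pu) for i, pu in enumerate(pus)]
    heapq.heapify(free_at)

    makespan = 0.0
    energy = 0.0
    assignment = []
    for task_id, cost in enumerate(tasks):
        now, i, pu = heapq.heappop(free_at)   # earliest idle PU
        run = cost[pu.kind]                   # simulated run time on that PU
        end = now + run
        energy += run * pu.power_w            # active energy only (assumption)
        makespan = max(makespan, end)
        assignment.append((task_id, pu.name))
        heapq.heappush(free_at, (end, i, pu))
    return makespan, energy, assignment


if __name__ == "__main__":
    # Hypothetical platform: one CPU core and one GPU.
    pus = [PU("cpu0", "cpu", 10.0), PU("gpu0", "gpu", 100.0)]
    # Hypothetical workload: four identical tasks, faster on the GPU.
    tasks = [{"cpu": 4.0, "gpu": 1.0} for _ in range(4)]
    print(simulate_fcfs(tasks, pus))
```

Changing the `pus` list (for example, adding or removing coprocessors) and comparing the returned makespan and energy mirrors, in a very simplified form, the kind of what-if experiment the abstract describes.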

[1]  Conor McBride.  Clowns to the left of me, jokers to the right (pearl): dissecting data structures, 2008, POPL '08.

[2]  Jun Kong, et al.  Comparative Performance Analysis of Intel Xeon Phi, GPU, and CPU, 2013, ArXiv.

[3]  Cédric Augonnet, et al.  StarPU: a Runtime System for Scheduling Tasks over Accelerator-Based Multicore Machines, 2010.

[4]  Teresa H. Y. Meng, et al.  Merge: a programming model for heterogeneous multi-core systems, 2008, ASPLOS.

[5]  Cédric Augonnet, et al.  Exploiting the Cell/BE Architecture with the StarPU Unified Runtime System, 2009, SAMOS.

[6]  Joel H. Saltz, et al.  Efficient Execution of Microscopy Image Analysis on CPU, GPU, and MIC Equipped Cluster Systems, 2014, 2014 IEEE 26th International Symposium on Computer Architecture and High Performance Computing.

[7]  Esteban Walter Gonzalez Clua, et al.  Efficient dynamic scheduling of heterogeneous applications in hybrid architectures, 2014, SAC.

[8]  Jun Kong, et al.  An Integrative Approach for In Silico Glioma Research, 2010, IEEE Transactions on Biomedical Engineering.

[9]  Cédric Augonnet, et al.  StarPU: a unified platform for task scheduling on heterogeneous multicore architectures, 2011, Concurr. Comput. Pract. Exp.

[10]  Bruno Raffin, et al.  Preliminary Experiments with XKaapi on Intel Xeon Phi Coprocessor, 2013, 2013 25th International Symposium on Computer Architecture and High Performance Computing.

[11]  James Reinders, et al.  Intel Xeon Phi Coprocessor High Performance Programming, 2013.

[12]  Mark Silberstein, et al.  PTask: operating system abstractions to manage GPUs as compute devices, 2011, SOSP.

[13]  Gagan Agrawal, et al.  Compiler and runtime support for enabling generalized reduction computations on heterogeneous parallel configurations, 2010, ICS '10.

[14]  Robert D. Blumofe, et al.  Scheduling multithreaded computations by work stealing, 1994, Proceedings 35th Annual Symposium on Foundations of Computer Science.

[15]  Cédric Augonnet, et al.  StarPU: a unified platform for task scheduling on heterogeneous multicore architectures, 2011, Concurr. Comput. Pract. Exp.

[16]  Warren Smith, et al.  Using Run-Time Predictions to Estimate Queue Wait Times and Improve Scheduler Performance, 1999, JSSPP.

[17]  D. I. George Amalarethinam, et al.  A new DAG based Dynamic Task Scheduling Algorithm (DYTAS) for Multiprocessor Systems, 2011.