Scalable Optimal Greedy Scheduler for Asymmetric Multi-/Many-Core Processors

Ubiquitous asymmetric multi-core processors such as ARM big.LITTLE combine cores with different power-performance characteristics on a single chip. Upcoming asymmetric many-core processors are expected to combine hundreds of cores of different types. Realizing the full potential of such processors, however, depends on the schedule that maps tasks to cores. Run-time scheduling on asymmetric processors is much harder to solve optimally than scheduling on symmetric processors, whose cores are all equivalent. We present the first greedy scheduler proven theoretically optimal (under certain constraints) for asymmetric processors. The proposed scheduler, called A-Greedy, improves throughput by 26% and reduces average response time by up to 45% compared to the default Linux scheduler on an ARM big.LITTLE asymmetric multi-core processor.
