Dynamic Round-Robin Task Scheduling to Reduce Cache Misses for Embedded Systems

Modern embedded CPU systems rely on a growing number of software features, but this growth enlarges the memory footprint and raises the need for efficient instruction and data caches. The embedded operating system often juggles a changing set of tasks in round-robin fashion, which inevitably causes cache misses due to conflicts between tasks. Our technique reduces these misses by continuously monitoring CPU cache misses to grade the performance of running tasks. Through a series of step-wise refinements, our software system tunes the round-robin ordering to find a better temporal sequence for the tasks. This tuning is done dynamically during program execution and can therefore adapt to changes in workload or external input stimulus. The benefits of this technique are illustrated using an ARM processor running application benchmarks with different cache organizations and round-robin scheduling techniques.
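
The abstract does not spell out the refinement procedure, so the following is only a minimal sketch of one way step-wise tuning of a round-robin order could work: try swapping adjacent tasks, measure the resulting cache misses, and keep a swap only when the count drops. Everything named here is an assumption made for illustration: `refine_order`, the `measure_misses` callback (which on real hardware would read a cache-miss counter over one scheduling round), and the toy cost model `toy_misses`, which simply charges one miss per cache set shared by consecutively scheduled tasks.

```python
from typing import Callable, Dict, List, Sequence, Set


def toy_misses(order: Sequence[str], footprint: Dict[str, Set[int]]) -> int:
    """Hypothetical stand-in for a hardware miss counter.

    Charges one miss for every cache set shared by two tasks that run
    back to back in the (cyclic) round-robin order.
    """
    n = len(order)
    total = 0
    for i in range(n):
        current_sets = footprint[order[i]]
        next_sets = footprint[order[(i + 1) % n]]
        total += len(current_sets & next_sets)
    return total


def refine_order(
    order: Sequence[str],
    measure_misses: Callable[[Sequence[str]], int],
    max_rounds: int = 10,
) -> List[str]:
    """Greedy step-wise refinement of a round-robin order (illustrative only).

    Each step swaps one adjacent pair of tasks and keeps the swap only if
    the measured miss count strictly decreases.
    """
    best = list(order)
    best_misses = measure_misses(best)
    for _ in range(max_rounds):
        improved = False
        for i in range(len(best) - 1):
            candidate = best[:]
            candidate[i], candidate[i + 1] = candidate[i + 1], candidate[i]
            misses = measure_misses(candidate)
            if misses < best_misses:
                best, best_misses = candidate, misses
                improved = True
        if not improved:
            break  # no adjacent swap helps; stop refining
    return best


if __name__ == "__main__":
    # Assumed footprints: A and B map to the same cache sets, as do C and D.
    footprint = {"A": {0, 1}, "B": {0, 1}, "C": {2, 3}, "D": {2, 3}}
    tuned = refine_order(["A", "B", "C", "D"], lambda o: toy_misses(o, footprint))
    print(tuned, toy_misses(tuned, footprint))
```

In this toy setup the search separates tasks that share cache sets so they no longer run back to back. A deployed scheduler would repeat the measurement periodically, since the abstract stresses that the tuning adapts to changing workloads at run time.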
