A shape matching approach for scheduling fine-grained parallelism

We present a compilation technique for scheduling parallelism on fine-grained asynchronous MIMD systems. We introduce the shape scheduling algorithm, which uses the flexibility of a MIMD system to exploit parallelism within and across basic blocks. Existing techniques exploit parallelism across basic blocks through speculative execution of instructions and code duplication. Our algorithm overlaps the execution of instructions from different basic blocks by matching the shapes of the schedules for those blocks. In addition, the shape algorithm can reduce compilation time by increasing the grain size of schedulable units. Experimental results demonstrate that the technique exploits parallelism effectively and that, with a larger grain size, the shape algorithm compiles faster without any significant reduction in program speedup.
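
The idea of overlapping two basic-block schedules by matching their shapes can be pictured with a small sketch. This is an illustration under stated assumptions, not the algorithm of the paper: a schedule is modeled as a grid of time steps by processors, the "shape" of a schedule is taken to be the idle slots at its bottom edge (for the first block) and at its top edge (for the second block), and data dependences between the two blocks are ignored. The names `Schedule`, `trailing_idle`, `leading_idle`, `max_overlap`, and `merge` are hypothetical.

```python
from typing import List, Optional

# Assumed model: rows are time steps, columns are processors,
# each cell holds an instruction name or None for an idle slot.
Schedule = List[List[Optional[str]]]


def trailing_idle(s: Schedule, p: int) -> int:
    """Number of idle slots at the end of processor p's column."""
    n = 0
    for row in reversed(s):
        if row[p] is not None:
            break
        n += 1
    return n


def leading_idle(s: Schedule, p: int) -> int:
    """Number of idle slots at the start of processor p's column."""
    n = 0
    for row in s:
        if row[p] is not None:
            break
        n += 1
    return n


def max_overlap(a: Schedule, b: Schedule) -> int:
    """Largest number of rows by which b can slide up into a.

    On every processor, the idle tail of a plus the idle head of b
    bounds how far the two shapes interlock without a slot conflict.
    Assumes both schedules are non-empty and equally wide.
    """
    procs = len(a[0])
    fit = min(trailing_idle(a, p) + leading_idle(b, p) for p in range(procs))
    return min(fit, len(a), len(b))


def merge(a: Schedule, b: Schedule) -> Schedule:
    """Place b after a, overlapped by max_overlap(a, b) rows."""
    start = len(a) - max_overlap(a, b)
    merged = [list(row) for row in a]
    for i, row in enumerate(b):
        t = start + i
        if t == len(merged):
            merged.append([None] * len(row))
        for p, op in enumerate(row):
            if op is not None:
                merged[t][p] = op  # slot is free by construction of the overlap
    return merged
```

For example, with `a = [["a1", "a2", "a3"], ["a4", None, None]]` and `b = [[None, "b1", "b2"], ["b3", "b4", None]]`, the two shapes interlock by one row, so the merged schedule takes three time steps instead of four. A real scheduler would also have to respect dependences and control flow between the blocks, which this sketch leaves out.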