Thread-level Speculative Parallelization

The basic idea behind speculative parallelization (also called thread-level speculation) [2, 6, 7] is to assign the execution of different blocks of consecutive iterations to different threads, running each one on its own processor. While execution proceeds, a software monitor ensures that no thread consumes an incorrect version of a value that should be calculated by a predecessor, which would violate sequential semantics. If such a dependence violation occurs, the monitor stops the parallel execution of the offending threads, discards the incorrectly calculated iterations, and restarts their execution using the correct values. Figure 1 shows an example of speculative parallel execution of a loop with dependences.

The detection of dependence violations can be done either in hardware or in software. Hardware solutions [4, 5] rely on additional hardware modules to detect dependences, while software methods [2, 6, 7] augment the original loop with new instructions that check for violations during the parallel execution.

The author’s visits to EPCC, thanks to the TRACS and HPC-Europa programmes, led to a successful collaboration with Dr. Marcelo Cintra, of the Division of Informatics, in the field of speculative parallelization. We have developed a new software-only speculative parallelization engine to automatically execute in parallel sequential loops with few or no dependences among iterations [1, 2, 3]. The main advantage of this solution is that it makes it possible for a compiler to parallelize an iterative application automatically, thus obtaining speedups on a parallel machine without the cost of manual parallelization. To do so, the compiler augments the original code with function calls that perform accesses to the structure shared among threads and that monitor the parallel execution of the loop. The next section discusses the mechanism in more detail.
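
As a rough illustration of the general idea described above, the following Python sketch runs blocks of consecutive iterations speculatively and re-executes any block that consumed a stale value. It is not the engine presented in [1, 2, 3]: all names (SpecView, run_block, speculative_loop, body) are hypothetical, and the sketch makes the simplifying assumption that violations are detected when blocks commit, by comparing the values each block read against the committed state, rather than during the parallel execution itself. The load and store methods play the role of the function calls that a compiler would insert around accesses to the shared structure.

```python
from concurrent.futures import ThreadPoolExecutor


class SpecView:
    """Private view of the shared array for one speculative block (hypothetical)."""

    def __init__(self, shared):
        self.shared = shared   # committed state, only read while speculating
        self.writes = {}       # index -> value written by this block
        self.reads = {}        # index -> value consumed from the committed state

    def load(self, i):
        if i in self.writes:       # value produced by this block itself
            return self.writes[i]
        if i not in self.reads:    # exposed load: remember what was consumed
            self.reads[i] = self.shared[i]
        return self.reads[i]

    def store(self, i, value):
        self.writes[i] = value     # buffered until the block commits


def run_block(body, shared, iterations):
    """Execute one block of consecutive iterations on a private view."""
    view = SpecView(shared)
    for it in iterations:
        body(view, it)
    return view


def speculative_loop(body, shared, n_iters, block_size, n_threads=4):
    """Run body(view, i) for i in range(n_iters); return the number of squashes."""
    blocks = [range(s, min(s + block_size, n_iters))
              for s in range(0, n_iters, block_size)]

    # Speculative phase: every block runs against the same committed state.
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        views = list(pool.map(lambda b: run_block(body, shared, b), blocks))

    # Commit phase: blocks commit in original loop order.
    squashes = 0
    for block, view in zip(blocks, views):
        # A block is invalid if a predecessor changed a value it consumed.
        if any(shared[i] != v for i, v in view.reads.items()):
            squashes += 1
            view = run_block(body, shared, block)  # redo with correct values
        for i, v in view.writes.items():
            shared[i] = v
    return squashes


def body(view, i):
    """Example loop body with a rare cross-iteration dependence."""
    if i % 8 == 0 and i > 0:
        view.store(i, view.load(i - 1) + 1)   # depends on the previous iteration
    else:
        view.store(i, view.load(i) * 2)       # independent iteration


data = list(range(16))
squashes = speculative_loop(body, data, n_iters=16, block_size=4)
print(squashes, data[6:10])   # prints: 1 [12, 14, 15, 18]
```

In this example only the block containing iteration 8 is squashed, because it read element 7 before the preceding block had committed its new value. The remaining blocks commit their speculative results unchanged, which is the situation in which loops with few dependences benefit from this kind of execution.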