Keynote lecture 2: program analysis and optimization for multi-core computing

As multi-core architectures become ubiquitous in modern computing, large-scale scientific applications must be redesigned to use the multiple cores efficiently and deliver higher performance. One major approach is the automatic detection of parallelism, in which optimizing compilers translate existing sequential programs into parallel programs that take advantage of the multiple processors. Optimizing compilers rely on program analysis techniques to detect data dependences between program statements, perform optimizations, and identify code fragments that can execute in parallel. In this work we study various program analysis and optimization techniques for multi-core computing and measure their impact in practice. We perform an experimental evaluation of several data dependence tests and program analysis techniques, comparing them in terms of data dependence accuracy, compilation efficiency, effectiveness in parallelization, and program execution performance. We run experiments using the Perfect Club Benchmarks, the SPEC benchmarks, and the scientific library LAPACK. We present the measured accuracy of each data dependence test and explain the reasons for inaccuracies. We compare the tests in terms of efficiency and analyze the tradeoffs between accuracy and efficiency. We also determine the impact of each data dependence test on the total compilation time. Finally, we measure the number of loops parallelized by each test and compare the execution performance of each benchmark on a multi-core architecture.
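To make the notion of a data dependence test concrete, the following is a minimal sketch of the classic GCD test, one of the simplest such tests (the abstract does not name the specific tests evaluated, so this example is illustrative only, and the function name `gcd_test` is our own). For a write to `A[a*i + b]` and a read of `A[c*i' + d]` inside a loop, the two references can touch the same element only if the linear Diophantine equation `a*i - c*i' = d - b` has an integer solution, which holds iff `gcd(a, c)` divides `d - b`:

```python
from math import gcd

def gcd_test(a: int, b: int, c: int, d: int) -> bool:
    """Conservative GCD data dependence test.

    Decides whether A[a*i + b] (write) and A[c*i' + d] (read) can
    refer to the same array element for integer loop indices i, i'.
    Returns True when a dependence is *possible*; the test is
    conservative, so True may be a false positive, but False is a
    proof of independence.
    """
    g = gcd(a, c)
    if g == 0:
        # Both coefficients are zero: the subscripts are constants,
        # so a dependence exists exactly when the constants match.
        return b == d
    # a*i - c*i' = d - b is solvable in integers iff g | (d - b).
    return (d - b) % g == 0

# A[2*i] = ... ; ... = A[2*i + 1]: even vs. odd indices never overlap,
# so the loop can be parallelized.
print(gcd_test(2, 0, 2, 1))   # False
# A[2*i] = ... ; ... = A[2*i + 2]: a loop-carried dependence is possible.
print(gcd_test(2, 0, 2, 2))   # True
```

A compiler reporting `False` here can safely run the loop's iterations in parallel; a `True` result forces it either to keep the loop sequential or to apply a more precise (and more expensive) test, which is exactly the accuracy/efficiency tradeoff the evaluation measures.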