A technique for summarizing data access and its use in parallelism enhancing transformations

Most existing systems for automatic detection of parallelism concentrate on finding “loop parallelism”, in which the separate iterations of a loop are executed in parallel [2, 7, 11]. However, there are many programs in which performance improvements can be achieved by also seeking “task parallelism”, such as parallelism between different loop nests or subroutine invocations. In this paper we are concerned with automatically uncovering task parallelism and supporting manual insertion of task parallelism. Although the data dependence graph of a program has proven useful for the detection of loop parallelism, it is not ideal for detecting task parallelism, primarily because the number of data dependences is very large even for programs of moderate size. Uncovering and enhancing task parallelism using the dependence graph can require the examination of every dependence edge between each pair of regions that are candidates for task parallelism, an expensive procedure. It would be much more efficient to summarize the effect of all the data dependences between tasks, and use this summary to guide the parallelism detection procedure. In this paper, we present a technique for summarizing the data accesses in a given region and show how this summary can be used to detect and enhance task parallelism in a program. For the sake of simplicity, we restrict our discussion to Fortran programs that consist of a sequence of perfectly-nested loops in which all subroutine calls are expanded inline. However, the