Compiler optimizations and architecture design issues for multiprocessors (parallel)

This work investigates the use of multiprocessor architectures, and of compilation for such architectures, to speed up the execution of numerical programs. Three types of parallelism in programs are exploited: (1) fine-grain parallelism, at the level of individual machine operations; (2) loop parallelism, between different iterations of the same loop; and (3) coarse-grain parallelism, between different parts of a program. Several multiprocessor architectures that exploit these types of parallelism are defined. Each belongs to either the shared-memory multiprocessor class or the multiple array processor class. Parallel programs for each architecture are generated automatically from serial code by a Fortran compiler. The performance of each architecture is studied through simulation, using different compilation methods for each type of parallelism and for the best combined use of all three types. In addition, the serial loops that remain in the parallel programs are studied to determine why they remain serial, how they affect performance, and whether they can be made parallel by the compiler or by a programmer.
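
As a small illustration of the difference between a loop whose iterations can run in parallel and one that must remain serial, the following sketch runs two loops in forward and reverse iteration order. It is not taken from the work itself: the function names, the arrays, and the choice of Python (rather than Fortran) are assumptions made only for this example. A loop whose result does not depend on iteration order has independent iterations; a loop with a loop-carried dependence (a recurrence) does not.

```python
def independent_loop(b, c, order):
    """a[i] = b[i] + c[i]: no iteration reads a value written by another."""
    a = [0.0] * len(b)
    for i in order:
        a[i] = b[i] + c[i]
    return a


def recurrence_loop(b, order):
    """a[i] = a[i-1] + b[i]: each iteration reads the previous one's result."""
    a = [0.0] * len(b)
    for i in order:
        prev = a[i - 1] if i > 0 else 0.0
        a[i] = prev + b[i]
    return a


# Hypothetical input data, chosen only for illustration.
b = [1.0, 2.0, 3.0, 4.0]
c = [10.0, 20.0, 30.0, 40.0]
forward = list(range(len(b)))
backward = forward[::-1]

# Independent iterations: both execution orders give the same answer,
# so the iterations could be distributed across processors.
same_when_independent = (
    independent_loop(b, c, forward) == independent_loop(b, c, backward)
)

# Loop-carried dependence: reordering changes the answer, so the loop
# has to stay serial unless it is restructured.
same_when_recurrent = recurrence_loop(b, forward) == recurrence_loop(b, backward)

print(same_when_independent, same_when_recurrent)
```

Comparing two iteration orders is only a demonstration on one input, not a proof of independence; a parallelizing compiler would instead rely on dependence analysis of the array subscripts.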