Optimized Parallelization of Loop Nests for Multi-core Array Architecture

Multi-core processors are widely used in high-performance computing; however, parallelizing regular sequential programs and reducing the running time of loop nests remain challenging. We present a dependence analysis of nested loops for tiling in the polyhedral model, which makes it possible to automatically transform sequential code into a coarse-grained parallel program. A genetic algorithm is then introduced to optimize the scheduling of the tiled task queue with respect to communication overhead on a multi-core array architecture. Simulation of LU decomposition shows that our approach generates more efficient parallel code, improving data locality and load balance across cores.
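
To illustrate how dependence analysis makes tiling yield coarse-grained parallelism, the following is a minimal sketch under stated assumptions and not the method evaluated in this work. It assumes a hypothetical two-deep loop nest with uniform dependence vectors (1, 0) and (0, 1) in place of LU decomposition, and the function names and tile size are chosen for illustration only. Because every dependence component is non-negative, rectangular tiling is legal, and tiles whose indices have the same sum do not depend on each other, so each such group could be dispatched to different cores.

```python
# Assumed example loop nest (not from the paper):
#   a[i][j] = a[i-1][j] + a[i][j-1]
# with dependence vectors (1, 0) and (0, 1).

def run_sequential(n):
    """Reference execution of the untiled loop nest."""
    a = [[1] * n for _ in range(n)]
    for i in range(1, n):
        for j in range(1, n):
            a[i][j] = a[i - 1][j] + a[i][j - 1]
    return a


def tile_wavefronts(n, tile):
    """Group tile coordinates (ti, tj) by ti + tj.

    Tile (ti, tj) depends only on (ti-1, tj) and (ti, tj-1), so all
    tiles within one group are mutually independent.
    """
    # Number of tiles per dimension needed to cover indices 1 .. n-1.
    num = (n - 1 + tile - 1) // tile
    fronts = []
    for w in range(2 * num - 1):
        fronts.append([(ti, w - ti) for ti in range(num) if 0 <= w - ti < num])
    return fronts


def run_tiled(n, tile):
    """Execute the loop nest tile by tile, one wavefront at a time."""
    a = [[1] * n for _ in range(n)]
    for front in tile_wavefronts(n, tile):
        # Tiles in this front are independent; here they run in sequence,
        # but each could be assigned to a separate core.
        for ti, tj in front:
            for i in range(1 + ti * tile, min(1 + (ti + 1) * tile, n)):
                for j in range(1 + tj * tile, min(1 + (tj + 1) * tile, n)):
                    a[i][j] = a[i - 1][j] + a[i][j - 1]
    return a
```

In this sketch the order of tiles within a wavefront is arbitrary. Choosing which core runs which tile, and in what order, is the scheduling problem that the genetic algorithm addresses in terms of communication overhead.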