Automatic Parallelization of Sequential Specifications for Symmetric MPSoCs

This paper presents an embedded system design toolchain that automatically generates parallel code for symmetric multiprocessor systems from an initial sequential specification written in C. We show how the initial C specification is translated into a modified system dependence graph with feedback edges (FSDG), which is the intermediate representation manipulated by the algorithm. We then describe how this graph is partitioned and optimized: at the end of the process, each partition (a cluster of nodes) represents a different task. The parallel C code produced allows the tasks to be dynamically scheduled on the target architecture; this is achieved by introducing a start condition for each task. We present experimental results obtained by applying our flow to the sequential code of the ADPCM and JPEG algorithms and by running the parallel specification produced by the toolchain on the target platform: with respect to the sequential specification, speedups of up to 70% and 42% were obtained for the two benchmarks, respectively.
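The idea of guarding each task with a start condition can be illustrated with a minimal sketch. Everything here is a hypothetical illustration and not the code emitted by the toolchain: the `task_t` descriptor, the four toy tasks, their conditions, and the `run_tasks` loop are assumptions, and the loop is a sequential stand-in for a dynamic scheduler running on several processors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_TASKS 4

/* Hypothetical task descriptor: one task per cluster of the partitioned
   graph, guarded by a start condition over the completion flags. */
typedef struct {
    bool (*can_start)(const bool done[]); /* start condition */
    void (*body)(void);                   /* code of the cluster */
} task_t;

/* Shared data produced and consumed by the toy tasks. */
static int a, b, c, d;

static void t0(void) { a = 2; }      /* no dependences */
static void t1(void) { b = a + 1; }  /* needs a */
static void t2(void) { c = a * 5; }  /* needs a, independent of t1 */
static void t3(void) { d = b + c; }  /* needs b and c */

static bool always(const bool done[])      { (void)done; return true; }
static bool after_t0(const bool done[])    { return done[0]; }
static bool after_t1_t2(const bool done[]) { return done[1] && done[2]; }

static const task_t tasks[NUM_TASKS] = {
    { always,      t0 },
    { after_t0,    t1 },
    { after_t0,    t2 },
    { after_t1_t2, t3 },
};

/* Sequential stand-in for a dynamic scheduler: repeatedly run any pending
   task whose start condition holds. Tasks are scanned in reverse order on
   purpose, so correctness depends only on the start conditions. */
int run_tasks(void) {
    bool done[NUM_TASKS] = { false };
    int executed = 0;
    bool progress = true;

    while (executed < NUM_TASKS && progress) {
        progress = false;
        for (size_t i = NUM_TASKS; i-- > 0; ) {
            if (!done[i] && tasks[i].can_start(done)) {
                tasks[i].body();
                done[i] = true;
                executed++;
                progress = true;
            }
        }
    }
    /* A stall with pending tasks would mean an unsatisfiable condition. */
    assert(executed == NUM_TASKS);
    return executed;
}

/* Accessor for the final value computed by t3. */
int result(void) { return d; }
```

In this sketch, `t1` and `t2` become ready at the same time once `t0` completes, which is where a real scheduler could assign them to different processors; the start conditions alone keep the result independent of the order in which ready tasks are picked.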
