An Efficient Algorithm for the Optimal Linear Schedule of Uniform Dependence Algorithms

Abstract

Real-time dynamic control of robot manipulators incurs a significant computational load, which can be tackled by parallel processing techniques. One of the most promising research directions in this field is the automatic parallelization of sequential algorithms. In this paper we present an efficient algorithm for parallelizing nested FOR (DO) loops with uniform data dependencies. Our work is based on the hyperplane concept, i.e. a set of computations that can be executed simultaneously and that lie on a group of parallel hyperplanes; the resulting hyperplane is guaranteed to be optimal. The primary task when executing serial algorithms on parallel architectures is finding a time mapping that defines a new execution ordering and exposes parallelism. Our objective is to schedule the computations of a specific class of algorithms, the uniform dependence algorithms, in time. The method presented here finds the optimal linear mapping and outperforms all previously published methods in terms of time complexity.
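The hyperplane idea described above can be made concrete with a small sketch. This is not the paper's algorithm; it only illustrates the standard validity condition for a linear schedule Pi over uniform dependence vectors (Pi . d >= 1 for every dependence d) and the resulting number of parallel time steps (hyperplanes) on a rectangular index space. The function names and the example dependence set are assumptions for illustration.

```python
import itertools

def is_valid_schedule(pi, deps):
    """A linear schedule Pi is valid iff Pi . d >= 1 for every
    uniform dependence vector d (no dependence crosses a hyperplane
    backwards in time)."""
    return all(sum(p * di for p, di in zip(pi, d)) >= 1 for d in deps)

def makespan(pi, deps, bounds):
    """Number of parallel time steps for the index space
    {0..bounds[k]-1}^n under a valid schedule Pi.  Points with the
    same time value lie on the same hyperplane and may run in parallel."""
    # Displacement: minimum time separation imposed by any dependence.
    disp = min(sum(p * di for p, di in zip(pi, d)) for d in deps)
    times = [sum(p * j for p, j in zip(pi, point)) // disp
             for point in itertools.product(*(range(b) for b in bounds))]
    return max(times) - min(times) + 1

# Example: a 2-D loop nest with uniform dependencies (1,0) and (0,1).
deps = [(1, 0), (0, 1)]
print(is_valid_schedule((1, 1), deps))   # hyperplanes j1 + j2 = const
print(is_valid_schedule((1, -1), deps))  # violates dependence (0,1)
print(makespan((1, 1), deps, (4, 4)))    # time steps for a 4x4 index space
```

Finding the optimal linear schedule then amounts to minimizing this makespan over all valid Pi; the paper's contribution is doing so with lower time complexity than exhaustive approaches.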
