Conflict-free scheduling of nested loop algorithms on lower dimensional processor arrays

In practice, it is often of interest to map n-dimensional algorithms, that is, algorithms with n nested loops, onto (k-1)-dimensional processor arrays where k<n. This paper addresses some problems left open in previous work by Shang and Fortes (1990). A procedure is proposed to test whether a given mapping has computational conflicts, and a lower bound on the total execution time is provided. Based on the testing procedure and the lower bound, the complexity and the optimality of the optimization procedure from the previous work are improved. An integer programming formulation is also discussed and used to find the optimal time mapping of the 5-dimensional bit-level matrix multiplication algorithm into a 2-dimensional bit-level processor array.
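To make the notion of a computational conflict concrete, the following is a minimal brute-force sketch in Python. It is not the testing procedure proposed in the paper. It assumes a mapping matrix T whose rows send an index point j to a (processor coordinates, time) image T·j, a box-shaped index set 0 <= j_i <= upper[i], and that a conflict means two distinct index points share the same image. The names `find_conflict`, `linear_schedule_time`, `T_good`, and `T_bad` are hypothetical and chosen only for illustration.

```python
from itertools import product


def find_conflict(T, upper):
    """Return the first pair of distinct index points with the same image
    under T, or None if the mapping is conflict-free on the box index set.

    T     : list of integer rows (space rows followed by the time row; assumed layout)
    upper : upper bounds of the box index set, 0 <= j[i] <= upper[i]
    """
    seen = {}
    for j in product(*(range(u + 1) for u in upper)):
        # Image of j: one coordinate per row of T (processor coordinates, then time).
        image = tuple(sum(t * x for t, x in zip(row, j)) for row in T)
        if image in seen:
            return seen[image], j
        seen[image] = j
    return None


def linear_schedule_time(pi, upper):
    """Total execution time of a linear schedule pi on the box index set,
    taken as max over index points of pi.(j1 - j2), plus one (assumed definition)."""
    return sum(abs(p) * u for p, u in zip(pi, upper)) + 1


# A 3-dimensional index set mapped onto a 1-dimensional (linear) array.
T_good = [[0, 0, 1], [1, 4, 1]]   # space row, then time row
T_bad = [[0, 0, 1], [1, 2, 1]]

print(find_conflict(T_good, (3, 3, 3)))          # None: no two points collide
print(find_conflict(T_bad, (3, 3, 3)))           # a colliding pair of index points
print(linear_schedule_time([1, 4, 1], (3, 3, 3)))
```

Enumerating every index point takes time proportional to the size of the index set, which is what motivates a dedicated testing procedure; the sketch is only useful for checking small cases by hand.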

[1] Peizong Lee, et al. Synthesizing Linear Array Algorithms from Nested For Loop Algorithms, 1988, IEEE Trans. Computers.

[2] Ravi Kannan, et al. Polynomial Algorithms for Computing the Smith and Hermite Normal Forms of an Integer Matrix, 1979, SIAM J. Comput.

[3] Dan I. Moldovan, et al. Partitioning and Mapping Algorithms into Fixed Size Systolic Arrays, 1986, IEEE Trans. Computers.

[4] J.A.B. Fortes, et al. Bit level processor arrays: current architectures and a design and programming tool, 1988, IEEE International Symposium on Circuits and Systems.

[5] Weijia Shang, et al. Time Optimal Linear Schedules for Algorithms with Uniform Dependencies, 1991, IEEE Trans. Computers.

[6] Thomas Kailath, et al. Subspace scheduling and parallel implementation of non-systolic regular iterative algorithms, 1989, J. VLSI Signal Process.

[7] Benjamin W. Wah, et al. Guest Editors' Introduction: Systolic Arrays-From Concept to Implementation, 1987, Computer.

[8] Zvi M. Kedem, et al. Mapping Nested Loop Algorithms into Multidimensional Systolic Arrays, 1990, IEEE Trans. Parallel Distributed Syst.

[9] Weijia Shang, et al. Time-Optimal and Conflict-Free Mappings of Uniform Dependence Algorithms into Lower Dimensional Processor Arrays, 1990, ICPP.

[10] Matthew T. O'Keefe, et al. A Comparative Study of Two Systematic Design Methodologies for Systolic Arrays, 1986, ICPP.

[11] Benjamin W. Wah, et al. The Design of Optimal Systolic Arrays, 1985, IEEE Trans. Computers.

[12] W. Shang, et al. On Time Mapping of Uniform Dependence Algorithms into Lower Dimensional Processor Arrays, 1992, IEEE Trans. Parallel Distributed Syst.