Mapping loop algorithms into reconfigurable mesh connected processor array

A method for mapping loop algorithms into mesh processor arrays is presented. The method can automatically detect and utilize parallelism in the algorithm. It is faster than similar methods with such capabilities, and it produces near-optimal solutions in most cases. Thus, it is suitable for mapping algorithms into reconfigurable systems. When the processors have local memories, the search space of the mapping can be reduced by storing data in the local memories, and the method takes advantage of this possibility. A technique to reduce data dependencies by distributing the functions in a loop body to different processors is also discussed. Since data dependencies determine the connections and the search space, this technique is useful in reducing the connections as well as speeding up the mapping. In order to explain the realization issues of the mapping, a novel mesh architecture with enhanced reconfigurable communication is briefly described.
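
As an illustration of why data dependencies determine the connections, the following minimal sketch uses the classic linear space-time mapping of a loop nest, which is not necessarily the method of this paper. The function name required_links, the schedule vector, and the allocation matrix are assumptions introduced for the example. Each dependence vector is projected onto a 2-D mesh: a nonzero displacement requires a link in that direction, while a zero displacement means the value stays in the same processor and can be kept in local memory.

```python
def required_links(dependences, schedule, allocation):
    """Project loop dependence vectors onto a mesh (hypothetical helper).

    dependences: dependence vectors of the loop nest, one tuple per dependence
    schedule:    linear time function, one coefficient per loop index
    allocation:  rows of a matrix mapping loop indices to mesh coordinates
    Returns (links, local): the set of mesh displacements that need a
    connection, and the dependences that stay inside one processor.
    """
    links, local = set(), []
    for d in dependences:
        # Time steps between producing and consuming the value.
        delay = sum(t * x for t, x in zip(schedule, d))
        if delay <= 0:
            raise ValueError(f"schedule violates dependence {d}")
        # Displacement of the value on the mesh.
        move = tuple(sum(a * x for a, x in zip(row, d)) for row in allocation)
        # Assumption: a value travels at most one mesh hop per time step.
        if sum(abs(m) for m in move) > delay:
            raise ValueError(f"dependence {d} needs more hops than its delay allows")
        if any(move):
            links.add(move)
        else:
            local.append(d)  # same processor: candidate for local memory
    return links, local


# Matrix-multiplication-style loop nest over (i, j, k), projected along k.
deps = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
links, local = required_links(deps, schedule=(1, 1, 1),
                              allocation=[(1, 0, 0), (0, 1, 0)])
print(sorted(links))  # [(0, 1), (1, 0)]
print(local)          # [(0, 0, 1)]
```

In this example only two link directions are needed, and the dependence along k never leaves its processor, which is the kind of case where local memory removes a connection from the search.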
