Adaptive parallel system as a high-performance parallel architecture

An approach for designing a hybrid parallel system that performs adaptively across different types of parallelism is presented. An adaptive parallel system (APS) is proposed to attain this goal. The APS is constructed by tightly integrating two different types of parallel architectures, namely a multiprocessor system and a memory-based processor array (MPA), into a single machine. The multiprocessor and the MPA optimally execute medium-to-coarse-grain parallelism and fine-grain data parallelism, respectively. One important feature of the APS is that data-parallel code is executed on the MPA through the usual subroutine call mechanism, so the existence of the MPA is transparent to the programmer. This research concerns the design of an underlying base architecture that executes efficiently across a broad range of applications, from coarse-grain to fine-grain parallelism. A performance model is also provided for fair comparison with other approaches. It turns out that the proposed APS can provide significant performance improvement and cost effectiveness for highly parallel applications exhibiting a mixed set of parallelisms.
