Partitioning tasks between a pair of interconnected heterogeneous processors: A case study

With the variety of computer architectures available today, it is often difficult to determine which type of architecture will provide the best performance on a given application program. In fact, one type of architecture may be well suited to executing one section of a program while another architecture may be better suited to executing a different section of the same program. One potentially promising approach for exploiting the best features of different computer architectures is to partition an application program to execute simultaneously on two or more types of machines interconnected with a high-speed communication network. A fundamental difficulty with this heterogeneous computing approach, however, is determining how to partition the application program across the interconnected machines. The goal of this paper is to show how a programmer or a compiler can use a model of a heterogeneous system to determine the machine on which each subtask should be executed. This technique is illustrated with a simple model that relates the relative performance of two heterogeneous machines to the communication time required to transfer partial results across their interconnection network. Experiments with a Connection Machine CM-200 demonstrate how to apply this model to partition two different application programs across the sequential front-end processor and the parallel back-end array.
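The core decision the abstract describes can be sketched informally: for each subtask, compare its estimated execution time on each machine, charging the interconnect communication time whenever a subtask's input must cross from the machine that produced it. The sketch below is a hypothetical greedy illustration of this trade-off, not the paper's actual model; the per-subtask time estimates and the single fixed communication cost are simplifying assumptions.

```python
def assign_subtasks(times_a, times_b, comm_time):
    """Greedily assign each subtask in sequence to machine 'A' or 'B'.

    times_a, times_b -- estimated execution time of each subtask on
                        machines A and B (hypothetical inputs).
    comm_time        -- cost of moving partial results between machines.

    A subtask pays comm_time only if the previous subtask ran on the
    other machine, so a slow interconnect discourages switching.
    """
    assignment = []
    current = None  # machine that produced the most recent partial result
    for ta, tb in zip(times_a, times_b):
        cost_a = ta + (comm_time if current == 'B' else 0)
        cost_b = tb + (comm_time if current == 'A' else 0)
        if cost_a <= cost_b:
            assignment.append('A')
            current = 'A'
        else:
            assignment.append('B')
            current = 'B'
    return assignment
```

With a cheap interconnect each subtask simply runs on its faster machine, e.g. `assign_subtasks([1, 10], [10, 1], 0)` yields `['A', 'B']`; with a costly one (`comm_time=100`) the same workload stays on machine A, mirroring the paper's point that the partition depends on the ratio of relative machine performance to communication time.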
