Paper information - Parallel graph reduction for divide-and-conquer applications -- Part II: program performance

An extensible machine architecture is devised to efficiently support a parallel reduction model of computation. The organisation of the machine is designed to match the behaviour of the application programs. A pilot implementation of the architecture is used to obtain an execution profile of the various applications. These profiles are used with a performance model to calculate optimal schedules of the applications. The resulting speedup figures give an upper bound for the performance gain that may be attained on a full implementation of the architecture. The most important result is that each application allows for a processor utilisation of over 50% to be attained on our parallel architecture.
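The relationship between an execution profile, a schedule, speedup and processor utilisation can be illustrated with a small sketch. It is not the authors' performance model: it assumes independent tasks with known durations and uses greedy longest-first list scheduling, a heuristic rather than an optimal schedule. The names `list_schedule`, `speedup_and_utilisation` and the sample `profile` are hypothetical.

```python
import heapq


def list_schedule(durations, processors):
    """Greedy list scheduling of independent tasks; returns the makespan."""
    # Each heap entry is the time at which one processor becomes free.
    free_at = [0.0] * processors
    heapq.heapify(free_at)
    # Longest task first: a common heuristic, not guaranteed to be optimal.
    for duration in sorted(durations, reverse=True):
        start = heapq.heappop(free_at)  # earliest available processor
        heapq.heappush(free_at, start + duration)
    return max(free_at)


def speedup_and_utilisation(durations, processors):
    """Speedup = sequential time / parallel time; utilisation = speedup / processors."""
    sequential = sum(durations)
    parallel = list_schedule(durations, processors)
    speedup = sequential / parallel
    return speedup, speedup / processors


# Hypothetical execution profile: task durations in arbitrary time units.
profile = [4.0, 3.0, 3.0, 2.0, 2.0, 2.0]
speedup, utilisation = speedup_and_utilisation(profile, 4)
print(f"speedup {speedup:.2f}, utilisation {utilisation:.0%}")
```

In this sketch a utilisation above 0.5 means that, on average, more than half of the processors are doing useful work for the duration of the schedule.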

Willem G. Vree | Pieter H. Hartel
