Computational models and resource allocation for supercomputers

Supercomputers use several different architectures, each with its own computational model. These models present a variety of resource allocation problems that must be solved. The computational needs of a program must be cast in terms of the computational model supported by the supercomputer, and this must be done in a way that makes effective use of the machine's resources. This is the resource allocation problem. The computational models of available supercomputers and the associated resource allocation techniques are surveyed. It is shown that many problems and solutions appear repeatedly in very different computing environments. Some case studies are presented, showing concrete computational models and the allocation strategies used.
