A Hardware / Software Approach

[1]  Doug Burger.  System-Level Implications of Processor-Memory Integration , 1997.

[2]  Kunle Olukotun,et al.  The case for a single-chip multiprocessor , 1996, ASPLOS VII.

[3]  Fong Pong,et al.  Missing the Memory Wall: The Case for Processor/Memory Integration , 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).

[4]  D. Burger,et al.  Memory Bandwidth Limitations of Future Microprocessors , 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).

[5]  Jack Dongarra,et al.  MPI: The Complete Reference , 1996.

[6]  Jack J. Dongarra,et al.  Software Libraries for Linear Algebra Computations on High Performance Computers , 1995, SIAM Rev..

[7]  Maya Gokhale,et al.  Processing in Memory: The Terasys Massively Parallel PIM Array , 1995, Computer.

[8]  David A. Wood,et al.  Cost-Effective Parallel Computing , 1995, Computer.

[9]  Peter M. Kogge,et al.  EXECUBE-A New Architecture for Scaleable MPPs , 1994, 1994 International Conference on Parallel Processing Vol. 1.

[10]  Marc Levoy,et al.  Parallel visualization algorithms: performance and architectural implications , 1994, Computer.

[11]  M. J. Carlton,et al.  Micro benchmark analysis of the KSR1 , 1993, Supercomputing '93.

[12]  Guy L. Steele,et al.  The High Performance Fortran Handbook , 1993.

[13]  Jack Dongarra,et al.  ScaLAPACK: a scalable linear algebra library for distributed memory concurrent computers , 1992, [Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation.

[14]  Duncan G. Elliott,et al.  Computational Ram: A Memory-simd Hybrid And Its Application To Dsp , 1992, 1992 Proceedings of the IEEE Custom Integrated Circuits Conference.

[15]  Yoshihiro Fujita,et al.  FA 15.2: A 3.84GIPS Integrated Memory Array Processor LSI with 64 Processing Elements and 2Mb SRAM , 1992.

[16]  Anthony J. G. Hey,et al.  The Genesis distributed memory benchmarks , 1991, Parallel Comput..

[17]  Guy E. Blelloch,et al.  A comparison of sorting algorithms for the connection machine CM-2 , 1991, SPAA '91.

[18]  John R. Nickolls,et al.  The design of the MasPar MP-1: a cost effective massively parallel computer , 1990, Digest of Papers Compcon Spring '90. Thirty-Fifth IEEE Computer Society International Conference on Intellectual Leverage.

[19]  David H. Bailey,et al.  FFTs in external or hierarchical memory , 1989, Proceedings of the 1989 ACM/IEEE Conference on Supercomputing (Supercomputing '89).

[20]  L. W. Tucker,et al.  Architecture and applications of the Connection Machine , 1988, Computer.

[21]  J. Hennessy,et al.  Performance tradeoffs in cache design , 1988, [1988] The 15th Annual International Symposium on Computer Architecture. Conference Proceedings.

[22]  John L. Gustafson,et al.  Reevaluating Amdahl's law , 1988, CACM.

[23]  L. Hernquist,et al.  Performance characteristics of tree codes , 1987.

[24]  Jack Dongarra,et al.  Computer benchmarking: paths and pitfalls , 1987.

[25]  James H. Patterson,et al.  Portable Programs for Parallel Processors , 1987.

[26]  W. Daniel Hillis,et al.  The connection machine , 1985.

[27]  Jack J. Dongarra,et al.  Performance of various computers using standard linear equations software in a Fortran environment , 1987, SGNM.

[28]  Henry Fuchs,et al.  Near real-time shaded display of rigid objects , 1983, SIGGRAPH.

[29]  Kenneth E. Batcher,et al.  Design of a Massively Parallel Processor , 1980, IEEE Transactions on Computers.

[30]  Steven Fortune,et al.  Parallelism in random access machines , 1978, STOC.

[31]  Charles R. Vick,et al.  PEPE architecture - present and future , 1978, AFIPS National Computer Conference.

[32]  S. F. Reddaway.  DAP—a distributed array processor , 1973, ISCA 1973.

[33]  Harold S. Stone,et al.  A Logic-in-Memory Computer , 1970, IEEE Transactions on Computers.

[34]  Daniel L. Slotnick.  Unconventional systems , 1967, AFIPS '67 (Spring).

[35]  Daniel L. Slotnick,et al.  The SOLOMON computer , 1962, AFIPS '62 (Fall).

[36]  T. A. Jeeves,et al.  On the use of the SOLOMON parallel-processing computer , 1962, AFIPS '62 (Fall).