The preliminary evaluation of MBP-light with two protocol policies for a massively parallel processor: JUMP-1

A massively parallel processor called JUMP-1 has been developed to build an efficient cache-coherent distributed shared memory (DSM) on a large system with more than 1000 processors. This paper introduces the dedicated processor called MBP (Memory Based Processor)-light, which manages the DSM of JUMP-1, and evaluates its preliminary performance under two protocol policies: update and invalidate. Simulation results indicate that, under both policies, simple operations such as the tag check and the collection and generation of acknowledgment packets are mostly processed by the hardware mechanisms in MBP-light without the aid of the core processor. The results also indicate that the buffer-register architecture adopted by the core processor in MBP-light is sufficiently exploited to process a protocol transaction under both policies.
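
For readers unfamiliar with the two policies, the following minimal Python sketch contrasts how a write is propagated under update and invalidate, and how acknowledgments are counted. It is an illustrative toy model only: the `Directory` class, the `write` function, and the per-node cache dictionary are hypothetical names introduced here, and nothing in it reflects the actual MBP-light hardware or the JUMP-1 protocol.

```python
from dataclasses import dataclass, field


@dataclass
class Directory:
    """Toy directory entry for one memory line (hypothetical, not JUMP-1's format)."""
    value: int = 0
    sharers: set = field(default_factory=set)


def write(directory, caches, writer, new_value, policy):
    """Apply a write under 'update' or 'invalidate'; return the acks collected."""
    if policy not in ("update", "invalidate"):
        raise ValueError(f"unknown policy: {policy}")
    directory.value = new_value
    caches[writer] = new_value
    acks = 0
    for node in sorted(directory.sharers - {writer}):
        if policy == "update":
            caches[node] = new_value   # push the new data into the sharer's copy
        else:
            caches.pop(node, None)     # discard the sharer's stale copy
        acks += 1                      # assume one acknowledgment per remote sharer
    if policy == "invalidate":
        directory.sharers = {writer}   # only the writer keeps a valid copy
    else:
        directory.sharers.add(writer)  # every sharer stays valid
    return acks
```

In this toy model both policies collect the same number of acknowledgments for a single write. They differ in what the sharers hold afterwards: updated copies under update, no copy under invalidate.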
