A case study of optimizing parallel computation and remote memory access
Computation performance and memory access performance are recognized as the key factors in high-performance sequential processing. For parallel computers, communication performance between processing elements (PEs) must be added as a third key factor. What matters most in designing a massively parallel computer is striking an optimal balance among these three factors. In particular, because it is hard to guarantee a fixed access latency between PEs in a massively parallel computer, local computation and global communication must be balanced to achieve efficient parallel processing. The architecture of a massively parallel computer must therefore be optimized as a total system, including compiler techniques and the load-balancing mechanism. EM-X is designed as a test bed for evaluating architectural features of massively parallel systems. It is designed to address the following fundamental issues in parallel architecture: (1) latency reduction by fusing the communication pipeline with the execution pipeline; (2) latency hiding by multithreading with quick thread switching; and (3) run-time latency minimization for remote memory access.
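The latency-hiding idea can be illustrated with a toy model. The sketch below is not EM-X's mechanism; it is a minimal simulation written for this summary, and every name and parameter in it (`simulate_pe`, `run_cycles`, `latency`, `switch_cost`) is a hypothetical choice. It assumes each thread computes for a fixed number of cycles, issues one remote read, and cannot run again until the reply arrives, while the PE switches to any other ready thread in the meantime.

```python
import heapq
from collections import deque


def simulate_pe(num_threads, run_cycles, latency, switch_cost, iterations):
    """Toy model: fraction of cycles one PE spends on useful computation.

    Each thread repeats `iterations` times: compute for `run_cycles`,
    then issue a remote read that takes `latency` cycles to return.
    Switching to a thread costs `switch_cost` cycles. All parameters are
    illustrative assumptions, not measurements of any real machine.
    """
    ready = deque(range(num_threads))   # threads that can run now
    waiting = []                        # min-heap of (wake_time, thread_id)
    remaining = [iterations] * num_threads
    clock = 0
    busy = 0

    while ready or waiting:
        # Wake every thread whose remote read has completed.
        while waiting and waiting[0][0] <= clock:
            _, tid = heapq.heappop(waiting)
            ready.append(tid)

        if not ready:
            # Nothing to run: the PE idles until the next reply arrives.
            clock = waiting[0][0]
            continue

        tid = ready.popleft()
        clock += switch_cost + run_cycles
        busy += run_cycles
        remaining[tid] -= 1
        if remaining[tid] > 0:
            # The thread blocks on its next remote read.
            heapq.heappush(waiting, (clock + latency, tid))

    return busy / clock


if __name__ == "__main__":
    for n in (1, 2, 5):
        u = simulate_pe(n, run_cycles=10, latency=40, switch_cost=0,
                        iterations=100)
        print(f"{n} thread(s): utilization {u:.2f}")
```

In this model, with a 10-cycle run length and a 40-cycle remote latency, a single thread leaves the PE idle most of the time, while five threads are enough to keep it fully busy when switching is free. A nonzero `switch_cost` lowers the achievable utilization, which is why quick thread switching matters for this approach.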