Performance of Cached DRAM Organizations in Vector Supercomputers

DRAMs containing cache memory are studied in the context of vector supercomputers. In particular, we consider systems where processors have no internal data caches and memory reference streams are generated by vector instructions. For this application, we expect that cached DRAMs can provide high bandwidth at relatively low cost. We study both DRAMs with a single, long cache line and DRAMs with multiple, smaller cache lines. Memory interleaving schemes that increase data locality are proposed and studied. The interleaving schemes are also shown to lead to non-uniform bank accesses, i.e., hot banks. This suggests an important optimization problem: interleaving should increase locality enough to improve performance, but not so much that hot banks diminish it. We show that for uniprocessor systems, both types of cached DRAMs work well with the proposed interleave methods. For multiprogrammed multiprocessors, the multiple cache line DRAMs work better.