A Two-level Concurrent Address Translation Cache of High Performance Interconnect Network

Most users are accustomed to using virtual addresses in parallel programs running on exascale computer systems. A virtual-to-physical address translation mechanism is therefore necessary and crucial to bridge the hardware interface and the software application. We propose a novel two-level concurrent address translation cache (TLC) for the TH Express-2 high-performance interconnect network. The TLC is composed of an L1 cache (L1C) and a main eDRAM-based cache (MEC). The L1C is a small, fast cache implemented in high-speed SRAM. The MEC employs large-capacity eDRAM (embedded dynamic random-access memory) macros to meet the high hit ratio requirement. To avoid stalls caused by refresh collisions, a novel eDRAM stall-hidden refresh algorithm is proposed. Many tests have been conducted on a real chip implementing the TLC. The results show that the MEC achieves a high hit ratio and the L1C a considerable hit ratio when running well-known benchmarks. Owing to the L1 cache, the total runtime of the TLC is reduced by about 14%, at an area cost of only 1.2%.
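
To make the two-level organization concrete, the following is a minimal software sketch of a lookup that tries a small L1C first, falls back to a larger MEC, and consults a page table only when both miss. It is an illustration under stated assumptions, not the hardware design. The 4 KiB page size, the LRU replacement policy, the entry counts, and the names `LRUCache`, `TwoLevelTranslationCache`, and `translate` are all hypothetical. The two levels are probed one after the other here, so neither the concurrent operation nor the eDRAM refresh algorithm is modeled.

```python
from collections import OrderedDict

PAGE_SHIFT = 12  # assumption: 4 KiB pages


class LRUCache:
    """Tiny LRU map from virtual page number (vpn) to physical page number (ppn)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, vpn):
        if vpn not in self.entries:
            return None
        self.entries.move_to_end(vpn)  # mark as most recently used
        return self.entries[vpn]

    def put(self, vpn, ppn):
        self.entries[vpn] = ppn
        self.entries.move_to_end(vpn)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used


class TwoLevelTranslationCache:
    """Hypothetical model: small L1C in front of a larger MEC."""

    def __init__(self, page_table, l1_entries=4, mec_entries=64):
        self.page_table = page_table  # backing store: vpn -> ppn
        self.l1c = LRUCache(l1_entries)
        self.mec = LRUCache(mec_entries)
        self.stats = {"l1c_hit": 0, "mec_hit": 0, "miss": 0}

    def translate(self, vaddr):
        vpn = vaddr >> PAGE_SHIFT
        offset = vaddr & ((1 << PAGE_SHIFT) - 1)
        ppn = self.l1c.get(vpn)
        if ppn is not None:
            self.stats["l1c_hit"] += 1
        else:
            ppn = self.mec.get(vpn)
            if ppn is not None:
                self.stats["mec_hit"] += 1
            else:
                self.stats["miss"] += 1
                ppn = self.page_table[vpn]  # raises KeyError if unmapped
                self.mec.put(vpn, ppn)
            self.l1c.put(vpn, ppn)  # fill L1C on any L1C miss
        return (ppn << PAGE_SHIFT) | offset
```

With a one-entry L1C, for example, repeated accesses to the same page hit in the L1C, while returning to a recently evicted page hits in the MEC instead. The `stats` counters give the per-level hit counts from which hit ratios of the kind reported above could be computed.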