Paper Information - Scheduling-Aware Prefetching: Enabling the PCIe SSD to Extend the Global Memory of GPU Device

Scheduling-Aware Prefetching: Enabling the PCIe SSD to Extend the Global Memory of GPU Device

The evolution of Cyber-Physical Systems (CPS) and the Internet of Things (IoT) enables mobile and smart embedded devices to be equipped with embedded GPUs for accelerating data-intensive applications. To cut device prices and reduce energy consumption, current GPUs adopt the unified memory architecture to extend memory capacity with a PCIe SSD, which is cheaper than directly enlarging the off-chip DRAM on the GPU. Under the unified memory architecture, however, data must be moved to the host DRAM before being moved to the off-chip DRAM, which leads to serious contention between CPUs and GPUs on the host DRAM. Although new communication technology gives GPUs the opportunity to access the PCIe SSD directly without passing through the host DRAM, doing so incurs high data movement costs because the latency gap between the off-chip DRAM and the PCIe SSD is large. To enhance the performance of low-cost, energy-efficient GPU memory systems, this work advocates a hardware-controller-based memory extension solution that not only avoids contention on the host DRAM but also reduces data movement costs. In particular, we propose a scheduling-aware prefetching design that performs data prefetching by utilizing information from the hardware warp scheduler. The proposed solution was evaluated through a series of intensive experiments, and the results are encouraging.

Tei-Wei Kuo | Yuan-Hao Chang | Che-Wei Tsao | Tse-Yuan Wang | Chun-Feng Wu