HSAemu 2.0: Full System Emulation for HSA platforms with Soft-MMU

As computing complexity increases and data proliferates, large-scale applications demand efficient next-generation system architectures at acceptable cost. Heterogeneous computing has emerged as a way to achieve both high performance and power efficiency, especially as graphics processing units (GPUs) are increasingly integrated into systems-on-chip (SoCs) and widely used in mobile devices. Heterogeneous System Architecture (HSA) is a set of standards published by the HSA Foundation to support heterogeneous computing, covering both runtime software and hardware specifications. To support the development and optimization of HSA-compliant systems and applications, we developed a full-system emulator, HSAemu 2.0, which conforms to the latest HSA 1.0 system specifications and supports application development with OpenCL 2.0 features such as shared virtual memory, device-side enqueue, and pipes. As a hardware/software co-design tool, HSAemu 2.0 not only supports the development of heterogeneous applications but also helps system vendors design and evaluate HSA runtime libraries, HSAIL compilers, and HSA hardware.
