Interference from GPU System Service Requests
[1] Adam T. Moody,et al. System Noise Revisited: Enabling Application Scalability and Reproducibility with SMT , 2016 .
[2] Kathirgamar Aingaran,et al. Software in Silicon in the Oracle SPARC M7 processor , 2016, 2016 IEEE Hot Chips 28 Symposium (HCS).
[3] Ruud Haring,et al. The Blue Gene/Q Compute chip , 2011, 2011 IEEE Hot Chips 23 Symposium (HCS).
[4] David A. Patterson,et al. In-datacenter performance analysis of a tensor processing unit , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[5] Ján Veselý,et al. Generic System Calls for GPUs , 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[6] Mahmut T. Kandemir,et al. VIP: Virtualizing IP chains on handheld platforms , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[7] T. Forshaw. Everything you always wanted to know , 1977 .
[8] Kenneth A. Ross,et al. Q100: the architecture and design of a database processing unit , 2014, ASPLOS.
[9] Christian Bienia,et al. PARSEC 2.0: A New Benchmark Suite for Chip-Multiprocessors , 2009 .
[10] Thomas F. Wenisch,et al. Unlocking bandwidth for GPUs in CC-NUMA systems , 2015, 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA).
[11] David A. Wood,et al. Border control: Sandboxing accelerators , 2015, 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[12] Greg Kroah-Hartman,et al. Linux Device Drivers , 1998 .
[13] Collin McCurdy,et al. The Scalable Heterogeneous Computing (SHOC) benchmark suite , 2010, GPGPU-3.
[14] Ján Veselý,et al. Observations and opportunities in architecting shared virtual memory for heterogeneous systems , 2016, 2016 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[15] Dan Bouvier,et al. Energy efficient graphics and multimedia in 28nm Carrizo APU , 2015, 2015 IEEE Hot Chips 27 Symposium (HCS).
[16] Ben Sander,et al. Applying AMD's Kaveri APU for heterogeneous computing , 2014, 2014 IEEE Hot Chips 26 Symposium (HCS).
[17] Andrew Siegel,et al. XSBench - The Development and Verification of a Performance Abstraction for Monte Carlo Reactor Analysis , 2014 .
[18] Michael Stumm,et al. FlexSC: Flexible System Call Scheduling with Exception-Less System Calls , 2010, OSDI.
[19] Sudhakar Yalamanchili,et al. Coordinated energy management in heterogeneous processors , 2014, Sci. Program..
[20] Indrani Paul,et al. Understanding idle behavior and power gating mechanisms in the context of modern benchmarks on CPU-GPU Integrated systems , 2015, 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA).
[21] Shinobu Nagayama,et al. Hardware Accelerators for Regular Expression Matching and Approximate String Matching , 2009 .
[22] Andrew W. Appel,et al. Virtual memory primitives for user programs , 1991, ASPLOS IV.
[23] Per Hammarlund,et al. 4th generation Intel™ Core processor, codenamed Haswell , 2013, 2013 IEEE Hot Chips 25 Symposium (HCS).
[24] Craig M. Wittenbrink,et al. NVIDIA's Tegra K1 system-on-chip , 2014, 2014 IEEE Hot Chips 26 Symposium (HCS).
[25] David A. Wood,et al. Crossing Guard: Mediating Host-Accelerator Coherence Interactions , 2017, ASPLOS.
[26] Antonio J. Peña,et al. Chai: Collaborative heterogeneous applications for integrated-architectures , 2017, 2017 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[27] Mark Silberstein,et al. GPUrdma: GPU-side library for high performance networking from GPU kernels , 2016, ROSS@HPDC.
[28] Simha Sethumadhavan,et al. Security Implications of Third-Party Accelerators , 2016, IEEE Computer Architecture Letters.
[29] Idit Keidar,et al. GPUfs: Integrating a file system with GPUs , 2013, TOCS.
[30] K. K. Ramakrishnan,et al. Eliminating receive livelock in an interrupt-driven kernel , 1996, TOCS.
[31] Jeffrey C. Mogul,et al. TCP Offload Is a Dumb Idea Whose Time Has Come , 2003, HotOS.
[32] C. Genest,et al. Everything You Always Wanted to Know about Copula Modeling but Were Afraid to Ask , 2007 .
[33] Ana Lucia Varbanescu,et al. KMA: A Dynamic Memory Manager for OpenCL , 2014, GPGPU@ASPLOS.
[34] Todd M. Austin,et al. A case for unlimited watchpoints , 2012, ASPLOS XVII.
[35] Kevin Skadron,et al. Pannotia: Understanding irregular GPGPU graph applications , 2013, 2013 IEEE International Symposium on Workload Characterization (IISWC).
[36] Silvio Savarese,et al. EVA: An efficient vision architecture for mobile systems , 2013, 2013 International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES).
[37] Sumti Jairath,et al. Next generation SPARC processor cache hierarchy , 2014, 2014 IEEE Hot Chips 26 Symposium (HCS).
[38] Irfan Ahmad,et al. vIC: Interrupt Coalescing for Virtual Machine Storage Device IO , 2011, USENIX Annual Technical Conference.
[39] Sudhakar Yalamanchili,et al. Cooperative boosting: needy versus greedy power management , 2013, ISCA.
[40] Mayank Daga,et al. Exploiting Coarse-Grained Parallelism in B+ Tree Searches on an APU , 2012, 2012 SC Companion: High Performance Computing, Networking Storage and Analysis.
[41] Ninghui Sun,et al. DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning , 2014, ASPLOS.
[42] Xiangyu Li,et al. Hetero-mark, a benchmark suite for CPU-GPU collaborative computing , 2016, 2016 IEEE International Symposium on Workload Characterization (IISWC).
[43] Mark Silberstein,et al. GPUnet , 2014, OSDI.
[44] Thomas F. Wenisch,et al. HARE: Hardware accelerator for regular expressions , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[45] Jeffrey Stuecheli,et al. CAPI: A Coherent Accelerator Processor Interface , 2015, IBM J. Res. Dev..