TLB Shootdown Mitigation for Low-Power Many-Core Servers with L1 Virtual Caches
Derek Hower | Abhishek Bhattacharjee | Binh Pham | Trey Cain
[1] Michel Dubois,et al. Virtual-address caches , 1997 .
[2] Larry Carter,et al. Universal classes of hash functions (Extended Abstract) , 1977, STOC '77.
[3] Michael M. Swift,et al. Reducing memory reference energy with opportunistic virtual caching , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).
[4] Aamer Jaleel,et al. CoLT: Coalesced Large-Reach TLBs , 2012, 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture.
[5] Thomas F. Wenisch,et al. Unlocking bandwidth for GPUs in CC-NUMA systems , 2015, 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA).
[6] Christoforos E. Kozyrakis,et al. Evaluating MapReduce for Multi-core and Multiprocessor Systems , 2007, 2007 IEEE 13th International Symposium on High Performance Computer Architecture.
[7] Gabriel H. Loh,et al. Increasing TLB reach by exploiting clustering in page translations , 2014, 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA).
[8] Michael M. Swift,et al. Efficient virtual memory for big memory servers , 2013, ISCA.
[9] Michel Dubois,et al. Virtual-address caches. 2. Multiprocessor issues , 1997, IEEE Micro.
[10] Brian W. Barrett,et al. Introducing the Graph 500 , 2010 .
[11] Burton H. Bloom,et al. Space/time trade-offs in hash coding with allowable errors , 1970, CACM.
[12] Daniel Sánchez,et al. Implementing Signatures for Transactional Memory , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[13] Jaehyuk Huh,et al. Efficient synonym filtering and scalable delayed translation for hybrid virtual caching , 2016, International Symposium on Computer Architecture.
[14] Mark Oskin,et al. A Software-Managed Approach to Die-Stacked DRAM , 2015, 2015 International Conference on Parallel Architecture and Compilation (PACT).
[15] Avi Mendelson,et al. DiDi: Mitigating the Performance Impact of TLB Shootdowns Using a Shared TLB Directory , 2011, 2011 International Conference on Parallel Architectures and Compilation Techniques.
[16] Stefanos Kaxiras,et al. A new perspective for efficient virtual-cache coherence , 2013, ISCA.
[17] Ján Veselý,et al. Large pages and lightweight memory management in virtualized environments: Can you have it both ways? , 2015, 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[18] Gurindar S. Sohi,et al. Revisiting virtual L1 caches: A practical design using dynamic synonym remapping , 2016, 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[19] John L. Henning. SPEC CPU2006 benchmark descriptions , 2006, CARN.
[20] Daniel J. Sorin,et al. UNified Instruction/Translation/Data (UNITD) coherence: One protocol to rule them all , 2010, HPCA-16 2010: The Sixteenth International Symposium on High-Performance Computer Architecture.