A new perspective for efficient virtual-cache coherence
[1] Antonio Robles,et al. Increasing the effectiveness of directory caches by deactivating coherence for private memory blocks, 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).
[2] Fredrik Larsson,et al. Simics: A Full System Simulation Platform , 2002, Computer.
[3] Anoop Gupta,et al. The SPLASH-2 programs: characterization and methodological considerations , 1995, ISCA.
[4] Jaehyuk Huh,et al. Subspace snooping: Filtering snoops with operating system support , 2010, 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT).
[5] Stefanos Kaxiras,et al. SARC Coherence: Scaling Directory Cache Coherence in Performance and Power , 2010, IEEE Micro.
[6] James R. Larus,et al. SPUR: A VLSI Multiprocessor Workstation, 1985.
[7] Michel Dubois,et al. Virtual-Address Caches, 1997.
[8] Patricia J. Teller. Translation-lookaside buffer consistency , 1990, Computer.
[9] Hong Jiang,et al. Pangaea: A tightly-coupled IA32 heterogeneous chip multiprocessor , 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[10] Sarita V. Adve,et al. DeNovo: Rethinking the Memory Hierarchy for Disciplined Parallelism , 2011, 2011 International Conference on Parallel Architectures and Compilation Techniques.
[11] Margaret Martonosi,et al. Shared last-level TLBs for chip multiprocessors , 2011, 2011 IEEE 17th International Symposium on High Performance Computer Architecture.
[12] James R. Larus,et al. Mechanisms for Cooperative Shared Memory, 1994.
[13] Brian N. Bershad,et al. Consistency management for virtually indexed caches , 1992, ASPLOS V.
[14] Kai Li,et al. The PARSEC benchmark suite: Characterization and architectural implications , 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[15] Niraj K. Jha,et al. GARNET: A detailed on-chip network model inside a full-system simulator , 2009, 2009 IEEE International Symposium on Performance Analysis of Systems and Software.
[16] Mike O'Connor,et al. Cache coherence for GPU architectures , 2013, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA).
[17] Leslie Kohn,et al. Introducing the Intel i860 64-bit microprocessor , 1989, IEEE Micro.
[18] Milon Mackey,et al. Mach on a Virtually Addressed Cache Architecture , 1990, USENIX MACH Symposium.
[19] Lixin Zhang,et al. Enigma: architectural and operating system support for reducing the impact of address translation , 2010, ICS '10.
[20] David L. Black,et al. Translation lookaside buffer consistency: a software approach , 1989, ASPLOS III.
[21] Michel Dubois,et al. The Synonym Lookaside Buffer: A Solution to the Synonym Problem in Virtual Caches , 2008, IEEE Transactions on Computers.
[22] Milo M. K. Martin,et al. Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset , 2005, CARN.
[23] W. H. Wang,et al. Organization and performance of a two-level virtual-real cache hierarchy , 1989, ISCA '89.
[24] James R. Larus,et al. Cooperative shared memory: software and hardware for scalable multiprocessors , 1993, TOCS.
[25] Michel Dubois,et al. Virtual-address caches. Part 2: Multiprocessor issues, 1997, IEEE Micro.
[26] David R. Cheriton,et al. Software-Controlled Caches in the VMP Multiprocessor , 1986, ISCA.
[27] Michael M. Swift,et al. Reducing memory reference energy with opportunistic virtual caching , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).
[28] Milind Girkar,et al. EXOCHI: architecture and programming environment for a heterogeneous multi-core multithreaded system , 2007, PLDI '07.
[29] James R. Goodman. Coherency for multiprocessor virtual address caches , 1987, ASPLOS 1987.
[30] David A. Wood,et al. Dynamic self-invalidation: reducing coherence overhead in shared-memory multiprocessors , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[31] Michel Cekleov,et al. Virtual-address caches. Part 1: problems and solutions in uniprocessors , 1997, IEEE Micro.
[32] David A. Wood,et al. A Primer on Memory Consistency and Cache Coherence , 2012, Synthesis Lectures on Computer Architecture.
[33] M. Dubois,et al. Tolerating late memory traps in ILP processors , 1999, Proceedings of the 26th International Symposium on Computer Architecture (Cat. No.99CB36367).
[34] Norman P. Jouppi,et al. Architectural And Organizational Tradeoffs In The Design Of The Multititan CPU , 1989, The 16th Annual International Symposium on Computer Architecture.
[35] Babak Falsafi,et al. Reactive NUCA: near-optimal block placement and replication in distributed caches , 2009, ISCA '09.
[36] Stefanos Kaxiras,et al. Complexity-effective multicore coherence , 2012, 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT).