Structure research in a sparse directory co-located with last-level cache

Most directly connected CC-NUMA multiprocessor systems adopt a directory to maintain cache coherence, so how the directory is implemented is one of the key factors in system performance. In this paper, we propose a novel technique that sets aside part of the last-level cache (e.g., the level-3 cache, L3) to store directory information. The size of the directory space can be adjusted flexibly according to the characteristics of different applications. Moreover, this approach offers a high level of parallelism at very little cost in design complexity, and it has almost no effect on the cache hit rate when the directory size is configured properly.
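The capacity trade-off behind an adjustable directory partition can be illustrated with a minimal sketch. It assumes the partition is made by reserving a number of ways in every set of a set-associative last-level cache; the abstract does not say how the space is actually divided. The class name, the 8-byte directory entry size, and the cache geometry below are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class PartitionedLLC:
    """Toy capacity model of a last-level cache that lends some ways to a directory.

    Assumption: the partition is way-based. The source does not specify this.
    """
    num_sets: int
    ways: int
    line_bytes: int
    dir_ways: int = 0          # ways per set reserved for directory entries
    dir_entry_bytes: int = 8   # hypothetical size of one directory entry

    def __post_init__(self) -> None:
        self._check(self.dir_ways)

    def _check(self, dir_ways: int) -> None:
        # At least one way per set must remain available for data.
        if not 0 <= dir_ways < self.ways:
            raise ValueError("dir_ways must be in [0, ways)")

    def resize_directory(self, dir_ways: int) -> None:
        """Change how many ways per set are reserved for the directory."""
        self._check(dir_ways)
        self.dir_ways = dir_ways

    @property
    def data_lines(self) -> int:
        """Cache lines still available for ordinary data."""
        return self.num_sets * (self.ways - self.dir_ways)

    @property
    def dir_entries(self) -> int:
        """Directory entries that fit in the reserved ways."""
        per_line = self.line_bytes // self.dir_entry_bytes
        return self.num_sets * self.dir_ways * per_line


# Hypothetical 8 MiB cache: 8192 sets x 16 ways x 64-byte lines.
llc = PartitionedLLC(num_sets=8192, ways=16, line_bytes=64)
llc.resize_directory(2)
print(llc.data_lines, llc.dir_entries)  # prints "114688 131072"
```

In this toy configuration, reserving 2 of 16 ways leaves 87.5% of the lines for data while providing 131072 directory entries. The model covers capacity only and does not predict the effect on hit rate.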
