Software Assisted Hardware Cache Coherence for Heterogeneous Processors
暂无分享,去创建一个
[1] Andreas Moshovos,et al. A Framework for Coarse-Grain Optimizations in the On-Chip Memory Hierarchy , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[2] David A. Wood,et al. Heterogeneous system coherence for integrated CPU-GPU systems , 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[3] Mark Silberstein,et al. PTask: operating system abstractions to manage GPUs as compute devices , 2011, SOSP.
[4] Babak Falsafi,et al. Multi-grain coherence directories , 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[5] Jaehyuk Huh,et al. Subspace snooping: Filtering snoops with operating system support , 2010, 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT).
[6] Mike O'Connor,et al. Cache coherence for GPU architectures , 2013, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA).
[7] Babak Falsafi,et al. Reactive NUCA: near-optimal block placement and replication in distributed caches , 2009, ISCA '09.
[8] Niraj K. Jha,et al. In-Network Coherence Filtering: Snoopy coherence without broadcasts , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[9] Babak Falsafi,et al. JETTY: filtering snoops for reduced energy consumption in SMP servers , 2001, Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture.
[10] Babak Falsafi,et al. TurboTag: Lookup filtering to reduce coherence directory power , 2010, 2010 ACM/IEEE International Symposium on Low-Power Electronics and Design (ISLPED).
[11] William J. Dally,et al. Stream register files with indexed access , 2004, 10th International Symposium on High Performance Computer Architecture (HPCA'04).
[12] Sanjay J. Patel,et al. Cohesion: a hybrid memory model for accelerators , 2010, ISCA.
[13] Andreas Moshovos. RegionScout: exploiting coarse grain sharing in snoop-based coherence , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[14] Kevin Skadron,et al. A characterization of the Rodinia benchmark suite with comparison to contemporary CMP workloads , 2010, IEEE International Symposium on Workload Characterization (IISWC'10).
[15] John E. Stone,et al. An asymmetric distributed shared memory model for heterogeneous parallel systems , 2010, ASPLOS XV.
[16] Natalie D. Enright Jerger,et al. Virtual tree coherence: Leveraging regions and in-network multicast trees for scalable cache coherence , 2008, 2008 41st IEEE/ACM International Symposium on Microarchitecture.
[17] Somayeh Sardashti,et al. The gem5 simulator , 2011, CARN.
[18] Mohammad Alisafaee. Spatiotemporal Coherence Tracking , 2012, 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture.