Application-to-Core Mapping Policies to Reduce Interference in On-Chip Networks
暂无分享,去创建一个
[1] Chita R. Das,et al. Aérgia: exploiting packet latency slack in on-chip networks , 2010, ISCA.
[2] Radu Marculescu,et al. DyAD - smart routing for networks-on-chip , 2004, Proceedings. 41st Design Automation Conference, 2004..
[3] Harish Patil,et al. Pin: building customized program analysis tools with dynamic instrumentation , 2005, PLDI '05.
[4] Jichuan Chang,et al. Cooperative Caching for Chip Multiprocessors , 2006, 33rd International Symposium on Computer Architecture (ISCA'06).
[5] Krste Asanovic,et al. Globally-Synchronized Frames for Guaranteed Quality-of-Service in On-Chip Networks , 2008, 2008 International Symposium on Computer Architecture.
[6] Anoop Gupta,et al. Scheduling and page migration for multiprocessor compute servers , 1994, ASPLOS VI.
[7] Richard E. Kessler,et al. Page placement algorithms for large real-indexed caches , 1992, TOCS.
[8] Onur Mutlu,et al. Efficient Runahead Execution: Power-Efficient Memory Latency Tolerance , 2006, IEEE Micro.
[9] D.A. Wood,et al. Reactive NUMA: A Design For Unifying S-COMA And CC-NUMA , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[10] Moinuddin K. Qureshi. Adaptive Spill-Receive for robust high-performance caching in CMPs , 2009, 2009 IEEE 15th International Symposium on High Performance Computer Architecture.
[11] Shashi Kumar,et al. A two-step genetic algorithm for mapping task graphs to a network on chip architecture , 2003, Euromicro Symposium on Digital System Design, 2003. Proceedings..
[12] Onur Mutlu,et al. Preemptive Virtual Clock: A flexible, efficient, and cost-effective QOS scheme for networks-on-chip , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[13] Radu Marculescu,et al. User-Aware Dynamic Task Allocation in Networks-on-Chip , 2008, 2008 Design, Automation and Test in Europe.
[14] John K. Bennett,et al. Efficient user-level thread migration and checkpointing on windows NT clusters , 1999 .
[15] Onur Mutlu,et al. Parallelism-Aware Batch Scheduling: Enhancing both Performance and Fairness of Shared DRAM Systems , 2008, 2008 International Symposium on Computer Architecture.
[16] Radu Marculescu,et al. Contention-aware application mapping for Network-on-Chip communication architectures , 2008, 2008 IEEE International Conference on Computer Design.
[17] T. N. Vijaykumar,et al. Heat-and-run: leveraging SMT and CMP to manage power density through the operating system , 2004, ASPLOS XI.
[18] Tong Li,et al. Operating system support for overlapping-ISA heterogeneous multi-core architectures , 2010, HPCA - 16 2010 The Sixteenth International Symposium on High-Performance Computer Architecture.
[19] Brian N. Bershad,et al. Avoiding conflict misses dynamically in large direct-mapped caches , 1994, ASPLOS VI.
[20] Srinivasan Murali,et al. Bandwidth-constrained mapping of cores onto NoC architectures , 2004, Proceedings Design, Automation and Test in Europe Conference and Exhibition.
[21] Stephen W. Keckler,et al. Regional congestion awareness for load balance in networks-on-chip , 2008, 2008 IEEE 14th International Symposium on High Performance Computer Architecture.
[22] Sharad Malik,et al. Orion: a power-performance simulator for interconnection networks , 2002, 35th Annual IEEE/ACM International Symposium on Microarchitecture, 2002. (MICRO-35). Proceedings..
[23] Anoop Gupta,et al. OS Support for Improving Data Locality on CC-NUMA Compute Servers , 1996 .
[24] Zhao Zhang,et al. A permutation-based page interleaving scheme to reduce row-buffer conflicts and exploit data locality , 2000, MICRO 33.
[25] Rajiv Kapoor,et al. Pinpointing Representative Portions of Large Intel® Itanium® Programs with Dynamic Instrumentation , 2004, 37th International Symposium on Microarchitecture (MICRO-37'04).
[26] David W. Nellans,et al. Handling the problems and opportunities posed by multiple on-chip memory controllers , 2010, 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT).
[27] Tajana Simunic,et al. Evaluating the impact of job scheduling and power management on processor lifetime for chip multiprocessors , 2009, SIGMETRICS '09.
[28] MutluOnur,et al. Efficient Runahead Execution , 2006 .
[29] Mor Harchol-Balter,et al. ATLAS : A Scalable and High-Performance Scheduling Algorithm for Multiple Memory Controllers , 2010 .
[30] Chita R. Das,et al. Application-aware prioritization mechanisms for on-chip networks , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[31] Sangyeun Cho,et al. Managing Distributed, Shared L2 Caches through OS-Level Page Allocation , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[32] Guy E. Blelloch,et al. Scheduling threads for constructive cache sharing on CMPs , 2007, SPAA '07.
[33] Jingcao Hu,et al. Energy-aware mapping for tile-based NoC architectures under performance constraints , 2003, Proceedings of the ASP-DAC Asia and South Pacific Design Automation Conference, 2003..
[34] R. K. Shyamasundar,et al. Introduction to algorithms , 1996 .
[35] Stijn Eyerman,et al. System-Level Performance Metrics for Multiprogram Workloads , 2008, IEEE Micro.
[36] Natalie D. Enright Jerger,et al. Achieving predictable performance through better memory controller placement in many-core CMPs , 2009, ISCA '09.
[37] Krste Asanovic,et al. Reducing power density through activity migration , 2003, ISLPED '03.
[38] William J. Dally,et al. Principles and Practices of Interconnection Networks , 2004 .
[39] Faye A. Briggs,et al. A study of performance impact of memory controller features in multi-processor server environment , 2004, WMPI '04.
[40] Song Jiang,et al. CLOCK-Pro: An Effective Improvement of the CLOCK Replacement , 2005, USENIX ATC, General Track.
[41] Gu-Yeon Wei,et al. Thread motion: fine-grained power management for multi-core systems , 2009, ISCA '09.