Application-to-core mapping policies to reduce memory interference in multi-core systems
Reetuparna Das | Rachata Ausavarungnirun | Onur Mutlu | Akhilesh Kumar | Mani Azimi
[1] Shashi Kumar,et al. A two-step genetic algorithm for mapping task graphs to a network on chip architecture , 2003, Euromicro Symposium on Digital System Design, 2003. Proceedings.
[2] Tong Li,et al. Operating system support for overlapping-ISA heterogeneous multi-core architectures , 2010, HPCA - 16 2010 The Sixteenth International Symposium on High-Performance Computer Architecture.
[3] Rajiv Kapoor,et al. Pinpointing Representative Portions of Large Intel® Itanium® Programs with Dynamic Instrumentation , 2004, 37th International Symposium on Microarchitecture (MICRO-37'04).
[4] Natalie D. Enright Jerger,et al. DBAR: An efficient routing algorithm to support multiple concurrent applications in networks-on-chip , 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).
[5] Chita R. Das,et al. Aérgia: exploiting packet latency slack in on-chip networks , 2010, ISCA.
[6] Onur Mutlu,et al. A QoS-Enabled On-Die Interconnect Fabric for Kilo-Node Chips , 2012, IEEE Micro.
[7] Harish Patil,et al. Pin: building customized program analysis tools with dynamic instrumentation , 2005, PLDI '05.
[8] Jichuan Chang,et al. Cooperative Caching for Chip Multiprocessors , 2006, 33rd International Symposium on Computer Architecture (ISCA'06).
[9] David W. Nellans,et al. Handling the problems and opportunities posed by multiple on-chip memory controllers , 2010, 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT).
[10] Sharad Malik,et al. Orion: a power-performance simulator for interconnection networks , 2002, 35th Annual IEEE/ACM International Symposium on Microarchitecture, 2002. (MICRO-35). Proceedings.
[11] Zhao Zhang,et al. A permutation-based page interleaving scheme to reduce row-buffer conflicts and exploit data locality , 2000, MICRO 33.
[12] Tong Li,et al. Using OS Observations to Improve Performance in Multicore Systems , 2008, IEEE Micro.
[13] Krste Asanovic,et al. Reducing power density through activity migration , 2003, ISLPED '03.
[14] Onur Mutlu,et al. Bottleneck identification and scheduling in multithreaded applications , 2012, ASPLOS XVII.
[15] Alexandra Fedorova,et al. Addressing shared resource contention in multicore processors via scheduling , 2010, ASPLOS 2010.
[16] D.A. Wood,et al. Reactive NUMA: A Design For Unifying S-COMA And CC-NUMA , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[17] Moinuddin K. Qureshi. Adaptive Spill-Receive for robust high-performance caching in CMPs , 2009, 2009 IEEE 15th International Symposium on High Performance Computer Architecture.
[18] Onur Mutlu,et al. Accelerating critical section execution with asymmetric multi-core architectures , 2009, ASPLOS.
[19] Sai Prashanth Muralidhara,et al. Reducing memory interference in multicore systems via application-aware memory channel partitioning , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[20] Ronald L. Rivest,et al. Introduction to Algorithms , 1990.
[21] Brian N. Bershad,et al. Avoiding conflict misses dynamically in large direct-mapped caches , 1994, ASPLOS VI.
[22] Chris Fallin,et al. Next generation on-chip networks: what kind of congestion control do we need? , 2010, Hotnets-IX.
[23] Srinivasan Murali,et al. Bandwidth-constrained mapping of cores onto NoC architectures , 2004, Proceedings Design, Automation and Test in Europe Conference and Exhibition.
[24] Reetuparna Das,et al. Application-to-Core Mapping Policies to Reduce Interference in On-Chip Networks , 2011.
[25] Tajana Simunic,et al. Evaluating the impact of job scheduling and power management on processor lifetime for chip multiprocessors , 2009, SIGMETRICS '09.
[26] Mor Harchol-Balter,et al. ATLAS: A Scalable and High-Performance Scheduling Algorithm for Multiple Memory Controllers , 2010.
[27] Radu Marculescu,et al. Energy-aware mapping for tile-based NoC architectures under performance constraints , 2003, ASP-DAC '03.
[28] Chita R. Das,et al. Application-aware prioritization mechanisms for on-chip networks , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[29] Mor Harchol-Balter,et al. Thread Cluster Memory Scheduling: Exploiting Differences in Memory Access Behavior , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.
[30] Anoop Gupta,et al. Scheduling and page migration for multiprocessor compute servers , 1994, ASPLOS VI.
[31] Onur Mutlu,et al. Preemptive Virtual Clock: A flexible, efficient, and cost-effective QOS scheme for networks-on-chip , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[32] Sangyeun Cho,et al. Managing Distributed, Shared L2 Caches through OS-Level Page Allocation , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[33] Murali Annavaram,et al. Mitigating Amdahl's Law through EPI Throttling , 2005, ISCA 2005.
[34] Michael D. Smith,et al. Voltage Smoothing: Characterizing and Mitigating Voltage Noise in Production Processors via Software-Guided Thread Scheduling , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.
[35] Onur Mutlu,et al. Efficient Runahead Execution: Power-Efficient Memory Latency Tolerance , 2006, IEEE Micro.
[36] Faye A. Briggs,et al. A study of performance impact of memory controller features in multi-processor server environment , 2004, WMPI '04.
[37] O. Mutlu,et al. Fairness via source throttling: a configurable and high-performance fairness substrate for multi-core memory systems , 2010, ASPLOS XV.
[38] Onur Mutlu,et al. Kilo-NOC: A heterogeneous network-on-chip architecture for scalability and service guarantees , 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).
[39] Song Jiang,et al. CLOCK-Pro: An Effective Improvement of the CLOCK Replacement , 2005, USENIX ATC, General Track.
[40] Gu-Yeon Wei,et al. Thread motion: fine-grained power management for multi-core systems , 2009, ISCA '09.
[41] Stijn Eyerman,et al. System-Level Performance Metrics for Multiprogram Workloads , 2008, IEEE Micro.
[42] Natalie D. Enright Jerger,et al. Achieving predictable performance through better memory controller placement in many-core CMPs , 2009, ISCA '09.
[43] Guy E. Blelloch,et al. Scheduling threads for constructive cache sharing on CMPs , 2007, SPAA '07.
[44] T. N. Vijaykumar,et al. Heat-and-run: leveraging SMT and CMP to manage power density through the operating system , 2004, ASPLOS XI.