Leveraging on Deep Memory Hierarchies to Minimize Energy Consumption and Data Access Latency on Single-Chip Cloud Computers
Thanasis Loukopoulos | Samee Ullah Khan | Cheng-Zhong Xu | Sajjad Ahmad Madani | Nikos Tziritas | Tahir Maqsood
[1] José González,et al. Distributed Cooperative Caching: An Energy Efficient Memory Scheme for Chip Multiprocessors , 2012, IEEE Transactions on Parallel and Distributed Systems.
[2] Shaik Mahmed. A 16-Core Processor with Shared-Memory and Message-Passing Communications, 2015.
[3] Hong He,et al. Task assignment in heterogeneous computing systems using an effective iterated greedy algorithm , 2011, J. Syst. Softw.
[4] Xu Cheng,et al. A 16-Core Processor With Shared-Memory and Message-Passing Communications , 2014, IEEE Transactions on Circuits and Systems I: Regular Papers.
[5] Wei-Che Tseng,et al. Data Allocation Optimization for Hybrid Scratch Pad Memory With SRAM and Nonvolatile Memory , 2013, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[6] Valentin Deaconu. Directed Graphs , 2010, Encyclopedia of Machine Learning.
[7] Edwin Hsing-Mean Sha,et al. Efficient assignment and scheduling for heterogeneous DSP systems , 2005, IEEE Transactions on Parallel and Distributed Systems.
[8] N. Muralimanohar,et al. CACTI 6.0: A Tool to Understand Large Caches, 2007.
[9] Wayne H. Wolf,et al. TGFF: task graphs for free , 1998, Proceedings of the Sixth International Workshop on Hardware/Software Codesign. (CODES/CASHE'98).
[10] Vivek Sarkar,et al. Hierarchical Place Trees: A Portable Abstraction for Task Parallelism and Data Movement , 2009, LCPC.
[11] Quan Chen,et al. Adaptive Cache Aware Bitier Work-Stealing in Multisocket Multicore Architectures , 2013, IEEE Transactions on Parallel and Distributed Systems.
[12] Ajoy Kumar Datta,et al. CPU Scheduling for Power/Energy Management on Multicore Processors Using Cache Miss and Context Switch Data , 2014, IEEE Transactions on Parallel and Distributed Systems.
[13] Tulika Mitra,et al. Integrated scratchpad memory optimization and task scheduling for MPSoC architectures , 2006, CASES '06.
[14] Karthick Rajamani,et al. Tiered Memory: An Iso-Power Memory Architecture to Address the Memory Power Wall , 2012, IEEE Transactions on Computers.
[15] Kenli Li,et al. Energy-Aware Data Allocation and Task Scheduling on Heterogeneous Multiprocessor Systems With Time Constraints , 2014, IEEE Transactions on Emerging Topics in Computing.
[16] César A. M. Marcon,et al. Partitioning and mapping on NoC-Based MPSoC: an energy consumption saving approach , 2011, NoCArc '11.
[17] Lothar Thiele,et al. Dynamic Power-Aware Mapping of Applications onto Heterogeneous MPSoC Platforms , 2010, IEEE Transactions on Industrial Informatics.
[18] Andrew A. Chien,et al. The future of microprocessors , 2011, Commun. ACM.
[19] Meikang Qiu,et al. Data Placement and Duplication for Embedded Multicore Systems With Scratch Pad Memory , 2013, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[20] Wei Zhang,et al. Hybrid SPM-cache architectures to achieve high time predictability and performance , 2013, 2013 IEEE 24th International Conference on Application-Specific Systems, Architectures and Processors.
[21] David Daly,et al. The cache and memory subsystems of the IBM POWER8 processor , 2015, IBM J. Res. Dev.
[22] Yi He,et al. Co-optimization of memory access and task scheduling on MPSoC architectures with multi-level memory , 2010, 2010 15th Asia and South Pacific Design Automation Conference (ASP-DAC).
[23] Nikil D. Dutt,et al. NoC-based fault-tolerant cache design in chip multiprocessors , 2014, ACM Trans. Embed. Comput. Syst.
[24] Jean-Luc Dekeyser,et al. Estimating Energy Consumption for an MPSoC Architectural Exploration , 2006, ARCS.
[25] Nikil D. Dutt,et al. HaVOC: A hybrid memory-aware virtualization layer for on-chip distributed ScratchPad and Non-Volatile Memories , 2012, DAC Design Automation Conference 2012.
[26] Wei-Che Tseng,et al. Minimizing Access Cost for Multiple Types of Memory Units in Embedded Systems Through Data Allocation and Scheduling , 2012, IEEE Transactions on Signal Processing.
[27] Y.-K. Kwok,et al. Static scheduling algorithms for allocating directed task graphs to multiprocessors , 1999, CSUR.
[28] Nikil D. Dutt,et al. A novel NoC-based design for fault-tolerance of last-level caches in CMPs , 2012, CODES+ISSS '12.
[29] Hai Jin,et al. DAGMap: efficient and dependable scheduling of DAG workflow job in Grid , 2010, The Journal of Supercomputing.
[30] Naehyuck Chang,et al. System-Level Performance and Power Optimization for MPSoC , 2015, ACM Trans. Embed. Comput. Syst.
[31] Christoforos E. Kozyrakis,et al. Towards energy-proportional datacenter memory with mobile DRAM , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).
[32] Mani Azimi,et al. Integration Challenges and Tradeoffs for Terascale Architectures, 2007.
[33] David Wentzlaff,et al. Processor: A 64-Core SoC with Mesh Interconnect, 2010.
[34] Partha Pratim Pande,et al. Performance evaluation and design trade-offs for wireless network-on-chip architectures , 2012, JETC.