Design Space Exploration of On-chip Ring Interconnection for a CPU-GPU Architecture
[1] Stephen W. Keckler,et al. Regional congestion awareness for load balance in networks-on-chip , 2008, 2008 IEEE 14th International Symposium on High Performance Computer Architecture.
[2] Chita R. Das,et al. A case for heterogeneous on-chip interconnects for CMPs , 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).
[3] Zeljko Zilic,et al. A Hybrid Ring/Mesh Interconnect for Network-on-Chip Using Hierarchical Rings for Global Routing , 2007, First International Symposium on Networks-on-Chip (NOCS'07).
[4] Hsien-Hsin S. Lee,et al. COMPASS: a programmable data prefetcher using idle GPU shaders , 2010, ASPLOS XV.
[5] Yale N. Patt,et al. Utility-Based Cache Partitioning: A Low-Overhead, High-Performance, Runtime Mechanism to Partition Shared Caches , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[6] Gul N. Khan,et al. Throughput-Oriented NoC Topology Generation and Analysis for High Performance SoCs , 2009, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[7] Onur Mutlu,et al. Kilo-NOC: A heterogeneous network-on-chip architecture for scalability and service guarantees , 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).
[8] G. Edward Suh,et al. A new memory monitoring scheme for memory-aware scheduling and partitioning , 2002, Proceedings Eighth International Symposium on High Performance Computer Architecture.
[9] Henry Wong,et al. Analyzing CUDA workloads using a detailed GPU simulator , 2009, 2009 IEEE International Symposium on Performance Analysis of Systems and Software.
[10] Radu Marculescu,et al. Exploiting the Routing Flexibility for Energy/Performance Aware Mapping of Regular NoC Architectures , 2003, DATE.
[11] Chita R. Das,et al. Application-aware prioritization mechanisms for on-chip networks , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[12] Kevin Skadron,et al. Rodinia: A benchmark suite for heterogeneous computing , 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[13] Hyesoon Kim,et al. TAP: A TLP-aware cache management policy for a CPU-GPU heterogeneous architecture , 2012, IEEE International Symposium on High-Performance Computer Architecture.
[14] Onur Mutlu,et al. Parallelism-Aware Batch Scheduling: Enhancing both Performance and Fairness of Shared DRAM Systems , 2008, 2008 International Symposium on Computer Architecture.
[15] Edward T. Grochowski,et al. Larrabee: A many-Core x86 architecture for visual computing , 2008, 2008 IEEE Hot Chips 20 Symposium (HCS).
[16] Cruz Izu,et al. The Adaptive Bubble Router , 2001, J. Parallel Distributed Comput.
[17] Rajiv Kapoor,et al. Pinpointing Representative Portions of Large Intel® Itanium® Programs with Dynamic Instrumentation , 2004, 37th International Symposium on Microarchitecture (MICRO-37'04).
[18] Michael J. Schulte,et al. ERCBench: An Open-Source Benchmark Suite for Embedded and Reconfigurable Computing , 2010, 2010 International Conference on Field Programmable Logic and Applications.
[19] John Kim,et al. Throughput-Effective On-Chip Networks for Manycore Accelerators , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.
[20] Onur Mutlu,et al. A case for bufferless routing in on-chip networks , 2009, ISCA '09.
[21] George Michelogiannakis,et al. An analysis of on-chip interconnection networks for large-scale chip multiprocessors , 2010, TACO.
[22] Gabriel H. Loh,et al. PIPP: promotion/insertion pseudo-partitioning of multi-core shared caches , 2009, ISCA '09.
[23] William J. Dally,et al. Flattened butterfly: a cost-efficient topology for high-radix networks , 2007, ISCA '07.
[24] Onur Mutlu,et al. Express Cube Topologies for on-Chip Interconnects , 2009, 2009 IEEE 15th International Symposium on High Performance Computer Architecture.
[25] Aamer Jaleel,et al. High performance cache replacement using re-reference interval prediction (RRIP) , 2010, ISCA.
[26] Chita R. Das,et al. Aérgia: exploiting packet latency slack in on-chip networks , 2010, ISCA.
[27] Nicola Concer,et al. aEqualized: A novel routing algorithm for the Spidergon Network On Chip , 2009, 2009 Design, Automation & Test in Europe Conference & Exhibition.
[28] Natalie D. Enright Jerger,et al. DBAR: An efficient routing algorithm to support multiple concurrent applications in networks-on-chip , 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).
[29] Miltos D. Grammatikakis,et al. NoC Topologies Exploration based on Mapping and Simulation Models , 2007, 10th Euromicro Conference on Digital System Design Architectures, Methods and Tools (DSD 2007).
[30] Timothy Mark Pinkston,et al. On Characterizing Performance of the Cell Broadband Engine Element Interconnect Bus , 2007, First International Symposium on Networks-on-Chip (NOCS'07).
[31] Radu Marculescu,et al. DyAD - smart routing for networks-on-chip , 2004, Proceedings. 41st Design Automation Conference, 2004.
[32] Natalie D. Enright Jerger,et al. Achieving predictable performance through better memory controller placement in many-core CMPs , 2009, ISCA '09.