Low-Overhead Network-on-Chip Support for Location-Oblivious Task Placement
暂无分享,去创建一个
John Kim | Gwangsun Kim | Dennis Abts | Jae W. Lee | Michael R. Marty | Michael Mihn-Jong Lee | D. Abts | Gwangsun Kim | John Kim | M. J. Lee
[1] Deborah K. Weisser,et al. Age-based packet arbitration in large-radix k-ary n-cubes , 2007, Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC '07).
[2] Scott Shenker,et al. Analysis and simulation of a fair queueing algorithm , 1989, SIGCOMM 1989.
[3] Onur Mutlu,et al. Preemptive Virtual Clock: A flexible, efficient, and cost-effective QOS scheme for networks-on-chip , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[4] Nam Sung Kim,et al. The case for GPGPU spatial multitasking , 2012, IEEE International Symposium on High-Performance Comp Architecture.
[5] Ran Ginosar,et al. QNoC: QoS architecture and design process for network on chip , 2004, J. Syst. Archit..
[6] Norman P. Jouppi,et al. Optimizing NUCA Organizations and Wiring Alternatives for Large Caches with CACTI 6.0 , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[7] William J. Dally,et al. Allocator implementations for network-on-chip routers , 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[8] John Kim,et al. Probabilistic Distance-Based Arbitration: Providing Equality of Service for Many-Core CMPs , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.
[9] Henry Wong,et al. Analyzing CUDA workloads using a detailed GPU simulator , 2009, 2009 IEEE International Symposium on Performance Analysis of Systems and Software.
[10] Alexandra Fedorova,et al. Addressing shared resource contention in multicore processors via scheduling , 2010, ASPLOS XV.
[11] Yuan Xie,et al. LOFT: A High Performance Network-on-Chip Providing Quality-of-Service Support , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.
[12] Natalie D. Enright Jerger,et al. Achieving predictable performance through better memory controller placement in many-core CMPs , 2009, ISCA '09.
[13] Krste Asanovic,et al. Globally-Synchronized Frames for Guaranteed Quality-of-Service in On-Chip Networks , 2008, 2008 International Symposium on Computer Architecture.
[14] Chita R. Das,et al. MediaWorm: A QoS Capable Router Architecture for Clusters , 2002, IEEE Trans. Parallel Distributed Syst..
[15] Reetuparna Das,et al. Application-to-core mapping policies to reduce memory interference in multi-core systems , 2012, 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT).
[16] Lixia Zhang,et al. Virtual Clock: A New Traffic Control Algorithm for Packet Switching Networks , 1990, SIGCOMM.
[17] J.H. Kim,et al. Rotating Combined Queueing (RCQ): Bandwidth and Latency Guarantees in Low-Cost, High-Performance Networks , 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).
[18] Calvin Lin,et al. Adaptive History-Based Memory Schedulers , 2004, 37th International Symposium on Microarchitecture (MICRO-37'04).
[19] Chita R. Das,et al. Aérgia: exploiting packet latency slack in on-chip networks , 2010, ISCA.
[20] Anujan Varma,et al. Design and analysis of frame-based fair queueing: a new traffic scheduling algorithm for packet-switched networks , 1996, SIGMETRICS '96.
[21] William E. Weihl,et al. Lottery scheduling: flexible proportional-share resource management , 1994, OSDI '94.
[22] Niraj K. Jha,et al. A 4.6Tbits/s 3.6GHz single-cycle NoC router with a novel switch allocator in 65nm CMOS , 2007, ICCD.
[23] Kees Goossens,et al. AEthereal network on chip: concepts, architectures, and implementations , 2005, IEEE Design & Test of Computers.
[24] Wen-mei W. Hwu,et al. Parboil: A Revised Benchmark Suite for Scientific and Commercial Throughput Computing , 2012 .
[25] Gu-Yeon Wei,et al. Process Variation Tolerant 3T1D-Based Cache Architectures , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[26] William J. Dally,et al. Principles and Practices of Interconnection Networks , 2004 .
[27] Kevin Skadron,et al. Rodinia: A benchmark suite for heterogeneous computing , 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[28] Nan Jiang,et al. A detailed and flexible cycle-accurate Network-on-Chip simulator , 2013, 2013 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[29] W. Dally,et al. Route packets, not wires: on-chip interconnection networks , 2001, Proceedings of the 38th Design Automation Conference (IEEE Cat. No.01CH37232).
[30] Wolf-Dietrich Weber,et al. A quality-of-service mechanism for interconnection networks in system-on-chips , 2005, Design, Automation and Test in Europe.
[31] Alexandra Fedorova,et al. AKULA: A toolset for experimenting and developing thread placement algorithms on multicore systems , 2010, 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT).
[32] Ganesh Lakshminarayana,et al. LOTTERYBUS: a new high-performance communication architecture for system-on-chip designs , 2001, DAC '01.
[33] A. Kumary,et al. A 4.6Tbits/s 3.6GHz single-cycle NoC router with a novel switch allocator in 65nm CMOS , 2007 .