Swizzle-Switch Networks for Many-Core Systems
David Blaauw | Reetuparna Das | Ronald G. Dreslinski | Sudhir Satpathy | Thomas F. Wenisch | Trevor N. Mudge | Dennis Sylvester | Geoffrey Blake | Nathaniel Ross Pinckney | Korey Sewell | Thomas Manville | Michael Cieslak
[1] Anoop Gupta,et al. Parallel computer architecture - a hardware/software approach, 1998.
[2] Hsien-Hsin S. Lee,et al. 3D-MAPS: 3D Massively parallel processor with stacked memory , 2012, 2012 IEEE International Solid-State Circuits Conference.
[3] Edward T. Grochowski,et al. Larrabee: A many-Core x86 architecture for visual computing , 2008, 2008 IEEE Hot Chips 20 Symposium (HCS).
[4] Nicola Concer,et al. Simulation and analysis of network on chip architectures: ring, spidergon and 2D mesh , 2006, Proceedings of the Design Automation & Test in Europe Conference.
[5] David Blaauw,et al. A 1.07 Tbit/s 128×128 swizzle network for SIMD processors , 2010, 2010 Symposium on VLSI Circuits.
[6] William J. Dally,et al. The BlackWidow High-Radix Clos Network , 2006, 33rd International Symposium on Computer Architecture (ISCA'06).
[7] William J. Dally,et al. A delay model and speculative architecture for pipelined routers , 2001, Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture.
[8] Mark D. Hill,et al. Virtual hierarchies to support server consolidation , 2007, ISCA '07.
[9] William J. Dally,et al. Design tradeoffs for tiled CMP on-chip networks , 2006, ICS '06.
[10] Mike Galles. Spider: a high-speed network interconnect , 1997, IEEE Micro.
[11] Dionisios N. Pnevmatikatos,et al. VLSI micro-architectures for high-radix crossbar schedulers , 2011, Proceedings of the Fifth ACM/IEEE International Symposium.
[12] Onur Mutlu,et al. Preemptive Virtual Clock: A flexible, efficient, and cost-effective QOS scheme for networks-on-chip , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[13] Onur Mutlu,et al. Express Cube Topologies for on-Chip Interconnects , 2009, 2009 IEEE 15th International Symposium on High Performance Computer Architecture.
[14] Hugh Garraway. Parallel Computer Architecture: A Hardware/Software Approach , 1999, IEEE Concurrency.
[15] Nick Baker,et al. Xbox 360 System Architecture , 2006, IEEE Micro.
[16] Dean M. Tullsen,et al. Interconnections in multi-core architectures: understanding mechanisms, overheads and scaling , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[17] Martin Hopkins,et al. Synergistic Processing in Cell's Multicore Architecture , 2006, IEEE Micro.
[18] Tobias Bjerregaard,et al. A survey of research and practices of Network-on-chip , 2006, CSUR.
[19] Doug Burger,et al. An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches , 2002, ASPLOS X.
[20] Nick McKeown,et al. The iSLIP scheduling algorithm for input-queued switches , 1999, TNET.
[21] Yan Zhang,et al. Power and performance comparison of crossbars and buses as on-chip interconnect structures , 1999, Conference Record of the Thirty-Third Asilomar Conference on Signals, Systems, and Computers (Cat. No.CH37020).
[22] Robert Patti,et al. Techniques for Producing 3D ICs with High-Density Interconnect, 2004.
[23] Radu Marculescu,et al. Energy- and performance-aware mapping for regular NoC architectures , 2005, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[24] Michael Gschwind,et al. The IBM Blue Gene/Q Compute Chip , 2012, IEEE Micro.
[25] Saurabh Dighe,et al. A 48-Core IA-32 Processor in 45 nm CMOS Using On-Die Message-Passing and DVFS for Performance and Power Scaling , 2011, IEEE Journal of Solid-State Circuits.
[26] Karthik Ramani,et al. Interconnect-Aware Coherence Protocols for Chip Multiprocessors , 2006, 33rd International Symposium on Computer Architecture (ISCA'06).
[27] Timothy Mattson,et al. A 48-Core IA-32 message-passing processor with DVFS in 45nm CMOS , 2010, 2010 IEEE International Solid-State Circuits Conference - (ISSCC).
[28] William J. Dally,et al. Flattened Butterfly Topology for On-Chip Networks , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[29] José Duato,et al. A performance evaluation of 2D-mesh, ring, and crossbar interconnects for chip multi-processors , 2009, 2009 2nd International Workshop on Network on Chip Architectures.
[30] William J. Dally,et al. Principles and Practices of Interconnection Networks, 2004.
[31] Henry Hoffmann,et al. On-Chip Interconnection Architecture of the Tile Processor , 2007, IEEE Micro.
[32] Krste Asanovic,et al. Globally-Synchronized Frames for Guaranteed Quality-of-Service in On-Chip Networks , 2008, 2008 International Symposium on Computer Architecture.
[33] Natalie D. Enright Jerger,et al. Virtual Circuit Tree Multicasting: A Case for On-Chip Hardware Multicast Support , 2008, 2008 International Symposium on Computer Architecture.
[34] David A. Wood,et al. Variability in architectural simulations of multi-threaded workloads , 2003, The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings.
[35] George Kornaros. BCB: A Buffered CrossBar Switch Fabric Utilizing Shared Memory , 2006, 9th EUROMICRO Conference on Digital System Design (DSD'06).
[36] Simon W. Moore,et al. A communication characterisation of Splash-2 and Parsec , 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[37] Trevor Mudge,et al. SWIFT: A 2.1Tb/s 32×32 self-arbitrating manycore interconnect fabric , 2011, 2011 Symposium on VLSI Circuits - Digest of Technical Papers.
[38] Guang R. Gao,et al. A study of the on-chip interconnection network for the IBM Cyclops64 multi-core architecture , 2006, Proceedings 20th IEEE International Parallel & Distributed Processing Symposium.
[39] Sharad Malik,et al. A power model for routers: modeling Alpha 21364 and InfiniBand routers , 2002, Proceedings 10th Symposium on High Performance Interconnects.
[40] William J. Dally,et al. Microarchitecture of a high radix router , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[41] David Blaauw,et al. Centip3De: A 3930DMIPS/W configurable near-threshold 3D stacked system with 64 ARM Cortex-M3 cores , 2012, 2012 IEEE International Solid-State Circuits Conference.
[42] Dionisios N. Pnevmatikatos,et al. A 128 x 128 x 24Gb/s Crossbar Interconnecting 128 Tiles in a Single Hop and Occupying 6% of Their Area , 2010, 2010 Fourth ACM/IEEE International Symposium on Networks-on-Chip.
[43] Kevin Skadron,et al. Temperature-aware microarchitecture , 2003, ISCA '03.
[44] Chita R. Das,et al. Design and evaluation of a hierarchical on-chip interconnect for next-generation CMPs , 2009, 2009 IEEE 15th International Symposium on High Performance Computer Architecture.
[45] Anoop Gupta,et al. The SPLASH-2 programs: characterization and methodological considerations , 1995, ISCA.
[46] Mahmut T. Kandemir,et al. CCC: crossbar connected caches for reducing energy consumption of on-chip multiprocessors , 2003, Euromicro Symposium on Digital System Design, 2003. Proceedings.
[47] Somayeh Sardashti,et al. The gem5 simulator , 2011, CARN.
[48] George Michelogiannakis,et al. An analysis of on-chip interconnection networks for large-scale chip multiprocessors , 2010, TACO.
[49] Timothy Johnson,et al. An 8-core, 64-thread, 64-bit power efficient sparc soc (niagara2) , 2007, ISPD '07.
[50] Miltos D. Grammatikakis,et al. NoC Topologies Exploration based on Mapping and Simulation Models , 2007, 10th Euromicro Conference on Digital System Design Architectures, Methods and Tools (DSD 2007).