Supporting efficient collective communication in NoCs
[1] Natalie D. Enright Jerger. SigNet: Network-on-chip filtering for coarse vector directories, 2010, 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010).
[2] Karthikeyan Sankaralingam, et al. Implementation and Evaluation of a Dynamically Routed Processor Operand Network, 2007, First International Symposium on Networks-on-Chip (NOCS'07).
[3] Chita R. Das, et al. A case for heterogeneous on-chip interconnects for CMPs, 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).
[4] Jeffrey T. Draper, et al. Multicast routing with dynamic packet fragmentation, 2009, GLSVLSI '09.
[5] David F. Heidel, et al. An Overview of the BlueGene/L Supercomputer, 2002, ACM/IEEE SC 2002 Conference (SC'02).
[6] David A. Patterson, et al. Computer Architecture: A Quantitative Approach, 1969.
[7] Li-Shiuan Peh, et al. Towards the ideal on-chip fabric for 1-to-many and many-to-1 communication, 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[8] W. Daniel Hillis, et al. The Network Architecture of the Connection Machine CM-5, 1996, J. Parallel Distributed Comput.
[9] Natalie D. Enright Jerger, et al. DBAR: An efficient routing algorithm to support multiple concurrent applications in networks-on-chip, 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).
[10] José Duato, et al. Efficient unicast and multicast support for CMPs, 2008, 2008 41st IEEE/ACM International Symposium on Microarchitecture.
[11] Anoop Gupta, et al. Reducing Memory and Traffic Requirements for Scalable Directory-Based Cache Coherence Schemes, 1990, ICPP.
[12] William J. Dally, et al. Principles and Practices of Interconnection Networks, 2004.
[13] Xue-Jun Yang, et al. The TianHe-1A Supercomputer: Its Hardware and Software, 2011.
[14] Valentin Puente, et al. MRR: Enabling fully adaptive multicast routing for CMP interconnection networks, 2009, 2009 IEEE 15th International Symposium on High Performance Computer Architecture.
[15] Natalie D. Enright Jerger, et al. Virtual Circuit Tree Multicasting: A Case for On-Chip Hardware Multicast Support, 2008, 2008 International Symposium on Computer Architecture.
[16] Natalie D. Enright Jerger, et al. On-Chip Networks, 2009, On-Chip Networks.
[17] Natalie D. Enright Jerger, et al. Virtual tree coherence: Leveraging regions and in-network multicast trees for scalable cache coherence, 2008, 2008 41st IEEE/ACM International Symposium on Microarchitecture.
[18] Rahul Boyapati, et al. Efficient lookahead routing and header compression for multicasting in networks-on-chip, 2010, 2010 ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS).
[19] Axel Jantsch, et al. Connection-oriented multicasting in wormhole-switched networks on chip, 2006, IEEE Computer Society Annual Symposium on Emerging VLSI Technologies and Architectures (ISVLSI'06).
[20] Fredrik Larsson, et al. Simics: A Full System Simulation Platform, 2002, Computer.
[21] A. Kumar, et al. A 4.6Tbits/s 3.6GHz single-cycle NoC router with a novel switch allocator in 65nm CMOS, 2007.
[22] Yingtao Jiang, et al. On an efficient NoC multicasting scheme in support of multiple applications running on irregular sub-networks, 2011, Microprocess. Microsystems.
[23] Ran Ginosar, et al. The Power of Priority: NoC Based Distributed Cache Coherency, 2007, First International Symposium on Networks-on-Chip (NOCS'07).
[24] Chita R. Das, et al. A low latency router supporting adaptivity for on-chip interconnects, 2005, Proceedings. 42nd Design Automation Conference, 2005.
[25] Hyungjun Kim, et al. Recursive partitioning multicast: A bandwidth-efficient routing for Networks-on-Chip, 2009, 2009 3rd ACM/IEEE International Symposium on Networks-on-Chip.
[26] William J. Dally, et al. A delay model and speculative architecture for pipelined routers, 2001, Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture.
[27] José Duato, et al. A New Theory of Deadlock-Free Adaptive Routing in Wormhole Networks, 1993, IEEE Trans. Parallel Distributed Syst.
[28] George Michelogiannakis, et al. Evaluating Bufferless Flow Control for On-chip Networks, 2010, 2010 Fourth ACM/IEEE International Symposium on Networks-on-Chip.
[29] Manfred Glesner, et al. New Theory for Deadlock-Free Multicast Routing in Wormhole-Switched Virtual-Channelless Networks-on-Chip, 2011, IEEE Transactions on Parallel and Distributed Systems.
[30] Dhabaleswar K. Panda. Fast barrier synchronization in wormhole k-ary n-cube networks with multidestination worms, 1995, Future Gener. Comput. Syst.
[31] Stephen W. Keckler, et al. Regional congestion awareness for load balance in networks-on-chip, 2008, 2008 IEEE 14th International Symposium on High Performance Computer Architecture.
[32] Hong Xu, et al. Efficient implementation of barrier synchronization in wormhole-routed hypercube multicomputers, 1992, [1992] Proceedings of the 12th International Conference on Distributed Computing Systems.
[33] Norman P. Jouppi, et al. CACTI 6.0: A Tool to Model Large Caches, 2009.
[34] Anoop Gupta, et al. The directory-based cache coherence protocol for the DASH multiprocessor, 1990, ISCA '90.
[35] Mark Horowitz, et al. Energy dissipation in general purpose microprocessors, 1996, IEEE J. Solid State Circuits.
[36] Kai Li, et al. The PARSEC benchmark suite: Characterization and architectural implications, 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[37] Sudhakar Yalamanchili, et al. Interconnection Networks: An Engineering Approach, 2002.
[38] Milos Prvulovic, et al. TLSync: Support for multiple fast barriers using on-chip transmission lines, 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).
[39] Niraj K. Jha, et al. A 4.6Tbits/s 3.6GHz single-cycle NoC router with a novel switch allocator in 65nm CMOS, 2007, ICCD.
[40] Lionel M. Ni, et al. Multi-address Encoding for Multicast, 1994, PCRCW.
[41] Ralph Grishman, et al. The NYU ultracomputer—designing a MIMD, shared-memory parallel machine, 1998, ISCA '98.