Efficient task spawning for shared memory and message passing in many-core architectures
Jürgen Teich | Andreas Weichslgartner | Jürgen Becker | Andreas Herkersdorf | Thomas Wild | Jan Heisswolf | Aurang Zaib
[1] R. Schaller, et al. Moore's law: past, present and future, 1997.
[2] Massimo Ruo Roch, et al. MEDEA: a hybrid shared-memory/message-passing multiprocessor NoC-based architecture, 2010, 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010).
[3] Anoop Gupta, et al. The Stanford FLASH Multiprocessor, 1994, ISCA.
[4] Bill Nitzberg, et al. Distributed shared memory: a survey of issues and algorithms, 1991, Computer.
[5] Jürgen Teich, et al. Network Interface with Task Spawning Support for NoC-Based DSM Architectures, 2015, ARCS.
[6] André Schiper, et al. High-Throughput Maps on Message-Passing Manycore Architectures: Partitioning versus Replication, 2014, Euro-Par.
[7] Jason Duell, et al. Productivity and performance using partitioned global address space languages, 2007, PASCO '07.
[8] Donald Yeung, et al. The MIT Alewife Machine: A Large-Scale Distributed-Memory Multiprocessor, 1991.
[9] Luca Benini, et al. Networks on Chips: A New SoC Paradigm, 2002.
[10] Dimitrios S. Nikolopoulos, et al. On-chip communication and synchronization mechanisms with cache-integrated network interfaces, 2010, Conf. Computing Frontiers.
[11] James Reinders, et al. Intel Xeon Phi Coprocessor High Performance Programming, 2013.
[12] S.K. Reinhardt, et al. Decoupled Hardware Support for Distributed Shared Memory, 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).
[13] Massoud Pedram, et al. A Novel Synthetic Traffic Pattern for Power/Performance Analysis of Network-on-Chips Using Negative Exponential Distribution, 2009, J. Low Power Electron.
[14] Timothy G. Mattson, et al. Light-weight communications on Intel's single-chip cloud computer processor, 2011, OPSR.
[15] Carl Ramey, et al. TILE-Gx100 ManyCore processor: Acceleration interfaces and architecture, 2011, 2011 IEEE Hot Chips 23 Symposium (HCS).
[16] Mary K. Vernon, et al. Comparison of hardware and software cache coherence schemes, 1991, ISCA '91.
[17] Jürgen Becker, et al. Providing multiple hard latency and throughput guarantees for packet switching networks on chip, 2013, Comput. Electr. Eng.
[18] Saurabh Dighe, et al. The 48-core SCC Processor: the Programmer's View, 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[19] Om Prakash Gangwal, et al. An efficient on-chip NI offering guaranteed services, shared-memory abstraction, and flexible network configuration, 2005.
[20] Shuming Chen, et al. Run-Time Partitioning of Hybrid Distributed Shared Memory on Multi-core Network-on-Chips, 2010, 2010 3rd International Symposium on Parallel Architectures, Algorithms and Programming.
[21] Shuming Chen, et al. Supporting Distributed Shared Memory on multi-core Network-on-Chips using a dual microcoded controller, 2010, 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010).
[22] Vivek Sarkar, et al. X10: an object-oriented approach to non-uniform cluster computing, 2005, OOPSLA '05.
[23] Timothy Mattson, et al. A 48-Core IA-32 message-passing processor with DVFS in 45nm CMOS, 2010, 2010 IEEE International Solid-State Circuits Conference - (ISSCC).