Evaluating The Performance of Non-Blocking Synchronisation on Modern Shared-Memory Multiprocessors
[1] Ralph Grishman,et al. The NYU ultracomputer—designing a MIMD, shared-memory parallel machine , 1998, ISCA '98.
[2] Edward D. Lazowska,et al. The Effect of Scheduling Discipline on Spin Overhead in Shared Memory Parallel Systems , 1991, IEEE Trans. Parallel Distributed Syst..
[3] Beng-Hong Lim,et al. Reactive synchronization algorithms for multiprocessors , 1994, ASPLOS VI.
[4] Anoop Gupta,et al. Working sets, cache sizes, and node granularity issues for large-scale multiprocessors , 1993, ISCA '93.
[5] Marc Levoy,et al. Parallel visualization algorithms: performance and architectural implications , 1994, Computer.
[6] Anoop Gupta,et al. The SPLASH-2 programs: characterization and methodological considerations , 1995, ISCA.
[7] Marc Levoy,et al. Volume rendering on scalable shared-memory MIMD architectures , 1992, VVS.
[8] Pat Hanrahan,et al. A rapid hierarchical radiosity algorithm , 1991, SIGGRAPH.
[9] Michael L. Scott,et al. Algorithms for scalable synchronization on shared-memory multiprocessors , 1991, TOCS.
[10] Yi Zhang,et al. A simple, fast and scalable non-blocking concurrent FIFO queue for shared memory multiprocessor systems , 2001, SPAA '01.
[11] David R. O'Hallaron,et al. Earthquake ground motion modeling on parallel computers , 1996, Supercomputing '96.
[12] Dimitrios S. Nikolopoulos,et al. A quantitative architectural evaluation of synchronization algorithms and disciplines on ccNUMA systems: the case of the SGI Origin2000 , 1999, ICS '99.
[13] Maged M. Michael,et al. Nonblocking Algorithms and Preemption-Safe Locking on Multiprogrammed Shared Memory Multiprocessors , 1998, J. Parallel Distributed Comput..
[14] D. Brandt,et al. Multi-level adaptive solutions to boundary-value problems , 1977, Mathematics of Computation.
[15] John L. Hennessy,et al. The performance advantages of integrating block data transfer in cache-coherent multiprocessors , 1994, ASPLOS VI.
[16] James R. Goodman,et al. Efficient Synchronization: Let Them Eat QOLB , 1997, International Symposium on Computer Architecture.
[17] Alexandre E. Eichenberger,et al. Impact of Load Imbalance on the Design of Software Barriers , 1995, ICPP.
[18] Anoop Gupta,et al. The DASH Prototype: Logic Overhead and Performance , 1993, IEEE Trans. Parallel Distributed Syst..
[19] Jaswinder Pal Singh,et al. A methodology and an evaluation of the SGI Origin2000 , 1998, SIGMETRICS '98/PERFORMANCE '98.
[20] D. Lenoski,et al. The SGI Origin: A ccNUMA Highly Scalable Server , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[21] Kenneth C. Yeager. The MIPS R10000 superscalar microprocessor , 1996, IEEE Micro.
[22] T. Lovett,et al. STiNG: A CC-NUMA Computer System for the Commercial Marketplace , 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).
[23] David R. O'Hallaron. Spark98: Sparse Matrix Kernels for Shared Memory and Message Passing Systems , 1997 .
[24] Anoop Gupta,et al. SPLASH: Stanford parallel applications for shared-memory , 1992, CARN.