Interactive locality optimization on NUMA architectures
[1] Martin Schulz, et al. Improving Data Locality Using Dynamic Page Migration Based on Memory Access Histograms, 2002, International Conference on Computational Science.
[2] D.A. Reed, et al. Scalable performance analysis: the Pablo performance analysis environment, 1993, Proceedings of Scalable Parallel Libraries Conference.
[3] Katherine A. Yelick, et al. Analyses and Optimizations for Shared Address Space Programs, 1996, J. Parallel Distributed Comput.
[4] Hermann Hellwagner, et al. SCI: Scalable Coherent Interface: Architecture and Software for High-Performance Compute Clusters, 1999.
[5] Emilio L. Zapata, et al. An Automatic Iteration/Data Distribution Method Based on Access Descriptors for DSMM, 1999, LCPC.
[6] Barton P. Miller, et al. The Paradyn Parallel Performance Measurement Tool, 1995, Computer.
[7] Anoop Gupta, et al. The Stanford FLASH Multiprocessor, 1994, ISCA.
[8] Roland Wismüller. Interoperability Support in the Distributed Monitoring System OCM, 1999.
[9] Gordon Stoll, et al. Performance analysis and visualization of parallel systems using SimOS and Rivet: a case study, 2000, Proceedings Sixth International Symposium on High-Performance Computer Architecture. HPCA-6 (Cat. No.PR00550).
[10] James K. Archibald. A cache coherence approach for large multiprocessor systems, 1988, ICS '88.
[11] Barton P. Miller, et al. IPS-2: The Second Generation of a Parallel Program Measurement System, 1990, IEEE Trans. Parallel Distributed Syst.
[12] Martin Schulz, et al. A simulation tool for evaluating shared memory systems, 2003, 36th Annual Simulation Symposium, 2003.
[13] Martin Schulz, et al. Visualizing the Memory Access Behavior of Shared Memory Applications on NUMA Architectures, 2001, International Conference on Computational Science.
[14] B. Miller, et al. The Paradyn Parallel Performance Measurement Tools, 1995.
[15] J. Tao, et al. Improving the Scalability of Shared Memory Systems through Relaxed Consistency, 2002.
[16] Martin Schulz, et al. Design and Implementation Aspects for the SMiLE Hardware Monitor, 2000.
[17] Guy Lemieux, et al. Design and implementation of the NUMAchine multiprocessor, 1998, Proceedings 1998 Design and Automation Conference. 35th DAC. (Cat. No.98CH36175).
[18] Anoop Gupta, et al. The Stanford FLASH multiprocessor, 1994, ISCA '94.
[19] Michael T. Heath, et al. Visualizing the performance of parallel programs, 1991, IEEE Software.
[20] Martin Schulz, et al. SMiLE: An Integrated, Multi-Paradigm Software Infrastructure for SCI-Based Clusters, 2002, 2nd IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGRID'02).
[21] Michael Oberhuber, et al. The Tool-set Project: Towards an Integrated Tool Environment for Parallel Programming, 1997.
[22] Anoop Gupta, et al. The SPLASH-2 programs: characterization and methodological considerations, 1995, ISCA.
[23] Marco Zagha, et al. Origin™ 2000 and Onyx2® Performance Tuning and Optimization Guide, 1993.
[24] Daniel A. Reed, et al. An approach to immersive performance visualization of parallel and wide-area distributed applications, 1999, Proceedings. The Eighth International Symposium on High Performance Distributed Computing (Cat. No.99TH8469).