Integrating Cache Performance Modeling and Tuning Support in Parallelization Tools
[1] Mahmut T. Kandemir, et al. Changing Interaction of Compiler and Architecture, 1997, Computer.
[2] Yong Luo, et al. Development and Validation of a Hierarchical Memory Model Incorporating CPU- and Memory-Operation Overlap, 1997.
[3] Mary K. Vernon, et al. POEMS: end-to-end performance design of large parallel adaptive computational systems, 1998, WOSP '98.
[4] D.A. Reed, et al. An Integrated Compilation and Performance Analysis Environment for Data Parallel Programs, 1995, Proceedings of the IEEE/ACM SC95 Conference.
[5] Mary K. Vernon, et al. LoPC: modeling contention in parallel algorithms, 1997, PPOPP '97.
[6] Dionisios N. Pnevmatikatos, et al. Cache performance of the SPEC92 benchmark suite, 1993, IEEE Micro.
[7] Adolfy Hoisie, et al. Performance and Scalability Analysis of Teraflop-Scale Parallel Architectures Using Multidimensional Wavefront Applications, 2000, Int. J. High Perform. Comput. Appl.
[8] Ramesh Subramonian, et al. LogP: towards a realistic model of parallel computation, 1993, PPOPP '93.
[9] James R. Larus, et al. Exploiting hardware performance counters with flow and context sensitive profiling, 1997, PLDI '97.
[10] Margaret Martonosi, et al. Integrating performance monitoring and communication in parallel computers, 1996, SIGMETRICS '96.
[11] John L. Hennessy, et al. The accuracy of trace-driven simulations of multiprocessors, 1993, SIGMETRICS '93.
[12] D. Lenoski, et al. The SGI Origin: A ccNUMA Highly Scalable Server, 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[13] Pei Cao, et al. Adaptive page replacement based on memory reference behavior, 1997, SIGMETRICS '97.
[14] Elana D. Granston, et al. A Cache Visualization Tool, 1997, Computer.
[15] Margaret Martonosi, et al. Tuning Memory Performance of Sequential and Parallel Programs, 1995, Computer.
[16] Sharad Malik, et al. Cache miss equations: an analytical representation of cache misses, 1997, ICS '97.
[17] M. Martonosi, et al. Informing Memory Operations: Providing Memory Performance Feedback in Modern Processors, 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).
[18] John L. Hennessy, et al. Mtool: An Integrated System for Performance Debugging Shared Memory Multiprocessor Applications, 1993, IEEE Trans. Parallel Distributed Syst.
[19] S. Turner, et al. Performance Analysis Using the MIPS R10000 Performance Counters, 1996, Proceedings of the 1996 ACM/IEEE Conference on Supercomputing.
[20] Cos S. Ierotheou, et al. Computer Aided Parallelisation Tools (CAPTools) - Conceptual Overview and Performance on the Parallelisation of Structured Mesh Codes, 1996, Parallel Comput.