HPCVIEW: A Tool for Top-down Analysis of Node Performance
Robert J. Fowler | Nathan R. Tallent | John M. Mellor-Crummey | Gabriel Marin
[1] Luiz De Rose. The Hardware Performance Monitor Toolkit, 2001, Euro-Par.
[2] S. Turner, et al. Performance Analysis Using the MIPS R10000 Performance Counters, 1996, Proceedings of the 1996 ACM/IEEE Conference on Supercomputing.
[3] D.A. Reed, et al. An Integrated Compilation and Performance Analysis Environment for Data Parallel Programs, 1995, Proceedings of the IEEE/ACM SC95 Conference.
[4] Lance M. Berc, et al. Continuous profiling: where have all the cycles gone?, 1997, ACM Trans. Comput. Syst.
[5] Jeffrey Dean, et al. ProfileMe: hardware support for instruction-level profiling on out-of-order processors, 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[6] Yong Luo, et al. Instruction-Level Microprocessor Modeling of Scientific Applications, 1999, ISHPC.
[7] Donald E. Knuth, et al. Optimal measurement points for program frequency counts, 1973.
[8] James R. Larus, et al. EEL: machine-independent executable editing, 1995, PLDI '95.
[9] Robert E. Tarjan. Testing flow graph reducibility, 1973, STOC '73.
[10] Wagner Meira, et al. Waiting time analysis and performance visualization in Carnival, 1996, SPDT '96.
[11] David B. Whalley, et al. Tools for application-oriented performance tuning, 2001, ICS '01.
[12] Paul Havlak, et al. Nesting of reducible and irreducible loops, 1997, TOPLAS.
[13] John L. Hennessy, et al. MTOOL: A Method for Isolating Memory Bottlenecks in Shared Memory Multiprocessor Programs, 1991, ICPP.
[14] Ying Zhang, et al. SvPablo: A Multi-language Performance Analysis System, 1998, Computer Performance Evaluation.
[15] Margaret Martonosi, et al. Integrating performance monitoring and communication in parallel computers, 1996, SIGMETRICS '96.
[16] David B. Whalley, et al. On providing useful information for analyzing and tuning applications, 2001, SIGMETRICS '01.
[17] Ken Kennedy, et al. Improving memory hierarchy performance for irregular applications, 1999, ICS '99.
[18] Thomas J. LeBlanc, et al. Parallel performance prediction using lost cycles analysis, 1994, Proceedings of Supercomputing '94.
[19] Ken Kennedy, et al. Estimating Interlock and Improving Balance for Pipelined Architectures, 1988, J. Parallel Distributed Comput.