Real Time Cache Performance Analyzing for Multi-core Parallel Programs
Rui Wang | Yuan Gao | Guolu Zhang
[1] Xi Chen, et al. Cache contention and application performance prediction for multi-core systems, 2010, 2010 IEEE International Symposium on Performance Analysis of Systems & Software (ISPASS).
[2] Kristof Beyls, et al. Reuse Distance as a Metric for Cache Behavior, 2001.
[3] Jack J. Dongarra, et al. Collecting Performance Data with PAPI-C, 2009, Parallel Tools Workshop.
[4] Sharad Malik, et al. Cache miss equations: a compiler framework for analyzing and tuning memory behavior, 1999, TOPL.
[5] Bruce Jacob, et al. The Memory System, 2017.
[6] P. Cochat, et al. Et al, 2008, Archives de pediatrie : organe officiel de la Societe francaise de pediatrie.
[7] Norman P. Jouppi, et al. Multi-Core Cache Hierarchies, 2011, Multi-Core Cache Hierarchies.
[8] F. Wolf, et al. Performance Profiling and Analysis of DoD Applications Using PAPI and TAU, 2005, 2005 Users Group Conference (DOD-UGC'05).
[9] Trevor N. Mudge, et al. Trace-driven memory simulation: a survey, 1997, CSUR.
[10] Derek L. Schuff, et al. Multicore-aware reuse distance analysis, 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW).
[11] Nathan R. Tallent, et al. HPCTOOLKIT: tools for performance analysis of optimized parallel programs, 2010, Concurr. Comput. Pract. Exp.
[12] Harish Patil, et al. Pin: building customized program analysis tools with dynamic instrumentation, 2005, PLDI '05.
[13] Keshav Pingali, et al. Ordered vs. unordered: a comparison of parallelism and work-efficiency in irregular algorithms, 2011, PPoPP '11.
[14] Bruce Jacob, et al. Memory Systems: Cache, DRAM, Disk, 2007.
[15] Ravi R. Iyer. On modeling and analyzing cache hierarchies using CASPER, 2003, 11th IEEE/ACM International Symposium on Modeling, Analysis and Simulation of Computer Telecommunications Systems, 2003. MASCOTS 2003.
[16] Anoop Gupta, et al. The SPLASH-2 programs: characterization and methodological considerations, 1995, ISCA.
[17] Michael Frumkin, et al. The OpenMP Implementation of NAS Parallel Benchmarks and its Performance, 2013.