Wei Wang | Jin Zhou | Tongping Liu | Xu Liu | Hui Guan | Xin Zhao
[1] Vivien Quéma,et al. MemProf: A Memory Profiler for NUMA Multicore Systems , 2012, USENIX Annual Technical Conference.
[2] Emery D. Berger,et al. Coz: finding code that counts with causal profiling , 2015, USENIX Annual Technical Conference.
[3] Alexandra Fedorova,et al. A case for NUMA-aware contention management on multicore systems , 2010, 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT).
[4] Guangming Zeng,et al. SyncPerf: Categorizing, Detecting, and Diagnosing Synchronization Performance Bugs , 2017, EuroSys.
[5] Nicholas Nethercote,et al. Valgrind: a framework for heavyweight dynamic binary instrumentation , 2007, PLDI '07.
[6] Philippe Olivier Alexandre Navaux,et al. Characterizing communication and page usage of parallel applications for thread and data mapping , 2015, Perform. Evaluation.
[7] Manuel Selva,et al. NumaMMA: NUMA MeMory Analyzer , 2018, ICPP.
[8] Kenjiro Taura,et al. PerfMemPlus: A Tool for Automatic Discovery of Memory Performance Problems , 2019, ISC.
[9] Collin McCurdy,et al. Memphis: Finding and fixing NUMA-related performance problems on multi-core platforms , 2010, 2010 IEEE International Symposium on Performance Analysis of Systems & Software (ISPASS).
[10] Weng-Fai Wong,et al. Dynamic cache contention detection in multi-threaded applications , 2011, VEE '11.
[11] Rui Yang,et al. Profiling Directed NUMA Optimization on Linux Systems: A Case Study of the Gaussian Computational Chemistry Code , 2011, 2011 IEEE International Parallel & Distributed Processing Symposium.
[12] Gokcen Kestor,et al. RTHMS: a tool for data placement on hybrid memory system , 2017, ISMM.
[13] Chen Tian,et al. PREDATOR: predictive false sharing detection , 2014, PPoPP '14.
[14] Emery D. Berger,et al. SHERIFF: precise detection and automatic mitigation of false sharing , 2011, OOPSLA '11.
[15] Susan L. Graham,et al. Gprof: A call graph execution profiler , 1982, SIGPLAN '82.
[16] Harish Patil,et al. Pin: building customized program analysis tools with dynamic instrumentation , 2005, PLDI '05.
[17] James C. Browne,et al. Enhancing performance optimization of multicore chips and multichip nodes with data structure metrics , 2012, 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT).
[18] Philippe Olivier Alexandre Navaux,et al. TABARNAC: visualizing and resolving memory access issues on NUMA architectures , 2015, VPA '15.
[19] Vikram S. Adve,et al. LLVM: a compilation framework for lifelong program analysis & transformation , 2004, International Symposium on Code Generation and Optimization, 2004. CGO 2004.
[20] Kathryn S. McKinley,et al. Hoard: a scalable memory allocator for multithreaded applications , 2000, SIGP.
[21] Sébastien Valat,et al. NUMAPROF, A NUMA Memory Profiler , 2018, Euro-Par Workshops.
[22] Xu Liu,et al. Cheetah: Detecting false sharing efficiently and effectively , 2016, 2016 IEEE/ACM International Symposium on Code Generation and Optimization (CGO).
[23] John M. Mellor-Crummey,et al. A tool to analyze the performance of multithreaded programs on NUMA architectures , 2014, PPoPP '14.
[24] Christian Bienia,et al. PARSEC 2.0: A New Benchmark Suite for Chip-Multiprocessors , 2009.
[25] Derek Bruening,et al. An infrastructure for adaptive dynamic optimization , 2003, International Symposium on Code Generation and Optimization, 2003. CGO 2003.
[26] Robert J. Fowler,et al. NUMA policies and their relation to memory architecture , 1991, ASPLOS IV.
[27] Hai Jin,et al. A Tool to Detect Performance Problems of Multi-threaded Programs on NUMA Systems , 2016, 2016 IEEE Trustcom/BigDataSE/ISPA.
[28] Derek Bruening,et al. AddressSanitizer: A Fast Address Sanity Checker , 2012, USENIX Annual Technical Conference.
[29] Christoph Lameter,et al. An overview of non-uniform memory access , 2013, CACM.