Featherlight on-the-fly false-sharing detection
[1] Chen Tian, et al. PREDATOR: predictive false sharing detection, 2014, PPoPP '14.
[2] Weng-Fai Wong, et al. Dynamic cache contention detection in multi-threaded applications, 2011, VEE '11.
[3] John M. Mellor-Crummey, et al. DeadSpy: a tool to pinpoint program inefficiencies, 2012, CGO '12.
[4] Shasha Wen, et al. An Efficient Abortable-locking Protocol for Multi-level NUMA Systems, 2017, PPoPP.
[5] Dean M. Tullsen, et al. Simultaneous multithreading: Maximizing on-chip parallelism, 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[6] Gerard J. Holzmann, et al. The Model Checker SPIN, 1997, IEEE Trans. Software Eng.
[7] Leslie Lamport, et al. Time, clocks, and the ordering of events in a distributed system, 1978, CACM.
[8] Christoforos E. Kozyrakis, et al. Phoenix rebirth: Scalable MapReduce on a large-scale shared-memory system, 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[9] Berkin Özisikyilmaz, et al. MineBench: A Benchmark Suite for Data Mining Workloads, 2006, 2006 IEEE International Symposium on Workload Characterization.
[10] E. Tammaru, et al. Guidelines for creating a debuggable processor, 1982, ASPLOS I.
[11] Emery D. Berger, et al. SHERIFF: precise detection and automatic mitigation of false sharing, 2011, OOPSLA '11.
[12] Robert Tappan Morris, et al. Locating cache performance bottlenecks using data profiling, 2010, EuroSys '10.
[13] Josef Weidendorfer, et al. Assessing cache false sharing effects by dynamic binary instrumentation, 2009, WBIA '09.
[14] Vincent Gramoli, et al. More than you ever wanted to know about synchronization: synchrobench, measuring the impact of the synchronization on concurrent algorithms, 2015, PPoPP.
[15] Christian Bienia, et al. Benchmarking modern multiprocessors, 2011.
[16] Michael L. Scott, et al. False sharing and its effect on shared memory performance, 1993.
[17] Barbara M. Chapman, et al. Detecting False Sharing in OpenMP Applications Using the DARWIN Framework, 2011, LCPC.
[18] Shiliang Hu, et al. LASER: Light, Accurate Sharing dEtection and Repair, 2016, 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[19] John Byrne, et al. Watching for Software Inefficiencies with Witch, 2018, ASPLOS.
[20] Nathan R. Tallent, et al. Binary analysis for measurement and attribution of program performance, 2009, PLDI '09.
[21] Leslie Lamport, et al. Concurrent reading and writing, 1977, Commun. ACM.
[22] Sandeep Koranne, et al. Boost C++ Libraries, 2011.
[23] Shiliang Hu, et al. Remix: online detection and repair of cache contention for the JVM, 2016, PLDI.
[24] Mark Scott Johnson. Some requirements for architectural support of software debugging, 1982, ASPLOS I.
[25] Bo Wu, et al. ScaAnalyzer: a tool to identify memory scalability bottlenecks in parallel programs, 2015, SC15: International Conference for High Performance Computing, Networking, Storage and Analysis.
[26] James R. Larus, et al. Exploiting hardware performance counters with flow and context sensitive profiling, 1997, PLDI '97.
[27] Yanbin Liu, et al. Detection of false sharing using machine learning, 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[28] Robert Tappan Morris, et al. An Analysis of Linux Scalability to Many Cores, 2010, OSDI.
[29] Dutch T. Meyer, et al. Whose cache line is it anyway?: operating system support for live detection and repair of false sharing, 2013, EuroSys '13.
[30] Nathan Froyd, et al. Scalability analysis of SPMD codes using expectations, 2007, ICS '07.
[31] Balaram Sinharoy, et al. IBM POWER7 performance modeling, verification, and evaluation, 2011.
[32] Susan L. Graham, et al. Gprof: A call graph execution profiler, 1982, SIGPLAN '82.
[33] Robert J. Hall, et al. Call path profiling, 1992, International Conference on Software Engineering.
[34] Dragan Bosnacki, et al. The Design of a Multicore Extension of the SPIN Model Checker, 2007, IEEE Transactions on Software Engineering.
[35] Nathan R. Tallent, et al. HPCTOOLKIT: tools for performance analysis of optimized parallel programs, 2010, Concurr. Comput. Pract. Exp.
[36] Xu Liu, et al. Cheetah: Detecting false sharing efficiently and effectively, 2016, 2016 IEEE/ACM International Symposium on Code Generation and Optimization (CGO).
[37] Shasha Wen, et al. REDSPY: Exploring Value Locality in Software, 2017, ASPLOS.