Pinpointing performance inefficiencies via lightweight variance profiling
Milind Chabbi | Xu Liu | Pengfei Su | Shuyin Jiao
[1] George Candea,et al. Efficient Tracing of Cold Code via Bias-Free Sampling , 2014, USENIX Annual Technical Conference.
[2] Shasha Wen,et al. Featherlight on-the-fly false-sharing detection , 2018, PPOPP.
[3] Thu D. Nguyen,et al. Exploiting Heterogeneity for Tail Latency and Energy Efficiency , 2017, 2017 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[4] Luiz De Rose,et al. Cray Performance Analysis Tools , 2008, Parallel Tools Workshop.
[5] Nathan R. Tallent,et al. HPCTOOLKIT: tools for performance analysis of optimized parallel programs , 2010, Concurr. Comput. Pract. Exp..
[6] Harish Patil,et al. Pin: building customized program analysis tools with dynamic instrumentation , 2005, PLDI '05.
[7] E. Tammaru,et al. Guidelines for creating a debuggable processor , 1982, ASPLOS I.
[8] Martin Schulz,et al. Exploring Traditional and Emerging Parallel Programming Models Using a Proxy Application , 2013, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing.
[9] Nicholas Nethercote,et al. Valgrind: a framework for heavyweight dynamic binary instrumentation , 2007, PLDI '07.
[10] Robert J. Fowler,et al. HPCVIEW: A Tool for Top-down Analysis of Node Performance , 2002, The Journal of Supercomputing.
[11] Martin Schulz,et al. Open|SpeedShop: An open source infrastructure for parallel performance analysis , 2008 .
[12] Derek Bruening,et al. Efficient, transparent, and comprehensive runtime code manipulation , 2004 .
[13] John Byrne,et al. Watching for Software Inefficiencies with Witch , 2018, ASPLOS.
[14] Xin Liu,et al. A Highly Effective Global Surface Wave Numerical Simulation with Ultra-High Resolution , 2016, SC16: International Conference for High Performance Computing, Networking, Storage and Analysis.
[15] Robert Tappan Morris,et al. Locating cache performance bottlenecks using data profiling , 2010, EuroSys '10.
[16] Hao Xu,et al. Can we trust profiling results?: understanding and fixing the inaccuracy in modern profilers , 2019, ICS.
[17] Bernd Mohr,et al. The Scalasca performance toolset architecture , 2010, Concurr. Comput. Pract. Exp..
[18] Ronald G. Dreslinski,et al. Reining in Long Tails in Warehouse-Scale Computers with Quick Voltage Boosting Using Adrenaline , 2017, ACM Trans. Comput. Syst..
[19] Christian Bienia,et al. Benchmarking modern multiprocessors , 2011 .
[20] Mark Scott Johnson. Some requirements for architectural support of software debugging , 1982, ASPLOS I.
[21] Gregory R. Ganger,et al. Automated Diagnosis Without Predictability Is a Recipe for Failure , 2012, HotCloud.
[22] Thomas F. Wenisch,et al. Statistical Analysis of Latency Through Semantic Profiling , 2017, EuroSys.
[23] Hwanju Kim,et al. TPC: Target-Driven Parallelism Combining Prediction and Correction to Reduce Tail Latency in Interactive Services , 2016, ASPLOS.
[24] John M. Mellor-Crummey,et al. A tool to analyze the performance of multithreaded programs on NUMA architectures , 2014, PPoPP '14.
[25] Milind Chabbi,et al. Pinpointing performance inefficiencies in Java , 2019, ESEC/SIGSOFT FSE.
[26] Wenguang Chen,et al. DRDDR: a lightweight method to detect data races in Linux kernel , 2016, The Journal of Supercomputing.
[27] Barzan Mozafari,et al. DBSherlock: A Performance Diagnostic Tool for Transactional Databases , 2016, SIGMOD Conference.
[28] Xu Liu,et al. Featherlight Reuse-Distance Measurement , 2019, 2019 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[29] Allen D. Malony,et al. The Tau Parallel Performance System , 2006, Int. J. High Perform. Comput. Appl..
[30] Nathan R. Tallent,et al. Binary analysis for measurement and attribution of program performance , 2009, PLDI '09.
[31] Sebastian Burckhardt,et al. Effective Data-Race Detection for the Kernel , 2010, OSDI.
[32] Emery D. Berger,et al. Coz: finding code that counts with causal profiling , 2015, USENIX Annual Technical Conference.
[33] Nathan R. Tallent,et al. Scalable Identification of Load Imbalance in Parallel Executions Using Call Path Profiles , 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[34] Mona Attariyan,et al. X-ray: Automating Root-Cause Diagnosis of Performance Anomalies in Production Software , 2012, OSDI.
[35] B. Welford. Note on a Method for Calculating Corrected Sums of Squares and Products , 1962 .
[36] Gregory R. Ganger,et al. Diagnosing Performance Changes by Comparing Request Flows , 2011, NSDI.
[37] Xiaoyin Wang,et al. CSOD: Context-Sensitive Overflow Detection , 2019, 2019 IEEE/ACM International Symposium on Code Generation and Optimization (CGO).
[38] James R. Larus,et al. Exploiting hardware performance counters with flow and context sensitive profiling , 1997, PLDI '97.
[39] Martin Schulz,et al. Reconciling Sampling and Direct Instrumentation for Unintrusive Call-Path Profiling of MPI Programs , 2011, 2011 IEEE International Parallel & Distributed Processing Symposium.
[40] Martin Schulz,et al. Open | SpeedShop: An open source infrastructure for parallel performance analysis , 2008, Sci. Program..
[41] John Mellor-Crummey,et al. Managing locality in grand challenge applications: a case study of the gyrokinetic toroidal code , 2008 .
[42] Emery D. Berger,et al. DoubleTake: Fast and Precise Error Detection via Evidence-Based Dynamic Analysis , 2016, 2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE).
[43] Balaram Sinharoy,et al. IBM POWER7 performance modeling, verification, and evaluation , 2011 .
[44] Ricardo Bianchini,et al. Few-to-Many: Incremental Parallelism for Reducing Tail Latency in Interactive Services , 2015, ASPLOS.
[45] Susan L. Graham,et al. Gprof: A call graph execution profiler , 1982, SIGPLAN '82.
[46] Minming Li,et al. TailCutter: Wisely cutting tail latency in cloud CDN under cost constraints , 2016, IEEE INFOCOM 2016 - The 35th Annual IEEE International Conference on Computer Communications.