Characterizing Active Data Sharing in Threaded Applications Using Shared Footprint
[1] Peter J. Denning, et al. Properties of the working-set model, 1972, CACM.
[2] Michael L. Scott, et al. False sharing and its effect on shared memory performance, 1993.
[3] Laxmi N. Bhuyan, et al. No More Backstabbing... A Faithful Scheduling Policy for Multithreaded Programs, 2011, 2011 International Conference on Parallel Architectures and Compilation Techniques.
[4] Donald Yeung, et al. Identifying optimal multicore cache hierarchies for loop-based parallel programs via reuse distance analysis, 2012, MSPC '12.
[5] Anoop Gupta, et al. The SPLASH-2 programs: characterization and methodological considerations, 1995, ISCA.
[6] Harish Patil, et al. Pin: building customized program analysis tools with dynamic instrumentation, 2005, PLDI '05.
[7] Chen Ding, et al. Linear-time Modeling of Program Working Set in Shared Cache, 2011, 2011 International Conference on Parallel Architectures and Compilation Techniques.
[8] Donald Yeung, et al. Coherent Profiles: Enabling Efficient Reuse Distance Analysis of Multicore Scaling for Loop-based Parallel Programs, 2011, 2011 International Conference on Parallel Architectures and Compilation Techniques.
[9] Milind Kulkarni, et al. Accelerating multicore reuse distance analysis with sampling and parallelization, 2010, 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT).
[10] Dean M. Tullsen, et al. Compiler Techniques for Reducing Data Cache Miss Rate on a Multithreaded Architecture, 2008, HiPEAC.
[11] Xipeng Shen, et al. The Significance of CMP Cache Sharing on Contemporary Multithreaded Applications, 2012, IEEE Transactions on Parallel and Distributed Systems.
[12] Kai Li, et al. The PARSEC benchmark suite: Characterization and architectural implications, 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[13] Derek L. Schuff, et al. Multicore-aware reuse distance analysis, 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW).
[14] Aamer Jaleel, et al. Last level cache (LLC) performance of data mining workloads on a CMP - a case study of parallel bioinformatics workloads, 2006, The Twelfth International Symposium on High-Performance Computer Architecture, 2006.
[15] Christian Bienia, et al. Benchmarking modern multiprocessors, 2011.
[16] Luiz André Barroso, et al. Memory system characterization of commercial workloads, 1998, ISCA.