Invited Paper: A Compile-time Cost Model for OpenMP
[1] Alejandro Duran, et al. Dynamic load balancing of MPI+OpenMP applications, 2004, International Conference on Parallel Processing (ICPP 2004).
[2] A. Snavely, et al. Modeling application performance by convolving machine signatures with application profiles, 2001, Proceedings of the Fourth Annual IEEE International Workshop on Workload Characterization (WWC-4).
[3] Mary K. Vernon, et al. Parallel program performance prediction using deterministic task graph analysis, 2004, TOCS.
[4] Kathryn S. McKinley, et al. A Compiler Optimization Algorithm for Shared-Memory Multiprocessors, 1998, IEEE Trans. Parallel Distributed Syst.
[5] Sharad Malik, et al. Precise miss analysis for program transformations with caches of arbitrary associativity, 1998, ASPLOS VIII.
[6] Dieter an Mey, et al. Hybrid Parallelization of CFD Applications with Dynamic Thread Balancing, 2004, PARA.
[7] Mary K. Vernon, et al. Analytic evaluation of shared-memory systems with ILP processors, 1998, ISCA.
[8] Ramesh Subramonian, et al. LogP: towards a realistic model of parallel computation, 1993, PPOPP '93.
[9] Mihai Burcea, et al. An Adaptive OpenMP Loop Scheduler for Hyperthreaded SMPs, 2004, PDCS.
[10] Vivek Sarkar, et al. On Estimating and Enhancing Cache Effectiveness, 1991, LCPC.
[11] Ko-Yang Wang. Precise compile-time performance prediction for superscalar-based computers, 1994, PLDI '94.
[12] Mary K. Vernon, et al. Poems: end-to-end performance design of large parallel adaptive computational systems, 1998, WOSP '98.
[13] J. M. Bull, et al. Measuring Synchronisation and Scheduling Overheads in OpenMP, 2007.
[14] Peter Naur. IFIP Working Group on ALGOL, 1962.
[15] Barbara M. Chapman, et al. Evaluating OpenMP on Chip MultiThreading Platforms, 2005, IWOMP.
[16] Amer Diwan, et al. SUIF Explorer: an interactive and interprocedural parallelizer, 1999, PPoPP '99.
[17] Ruoming Jin, et al. A methodology for detailed performance modeling of reduction computations on SMP machines, 2005, Perform. Evaluation.
[18] Ruoming Jin, et al. Performance prediction for random write reductions: a case study in modeling shared memory programs, 2002, SIGMETRICS '02.
[19] Yan Solihin, et al. Predicting inter-thread cache contention on a chip multi-processor architecture, 2005, 11th International Symposium on High-Performance Computer Architecture.
[20] Bruce M. Maggs, et al. Models of Parallel Computation: A Survey and Synthesis, 1995, Proceedings of the 28th Annual Hawaii International Conference on System Sciences.
[21] Wenguang Chen, et al. OpenUH: an optimizing, portable OpenMP compiler, 2007, Concurr. Comput. Pract. Exp.
[22] Carl Staelin, et al. lmbench: Portable Tools for Performance Analysis, 1996, USENIX Annual Technical Conference.
[23] Michael Voss, et al. Reducing parallel overheads through dynamic serialization, 1999, Proceedings 13th International Parallel Processing Symposium and 10th Symposium on Parallel and Distributed Processing (IPPS/SPDP 1999).
[24] Michael E. Wolf, et al. Combining Loop Transformations Considering Caches and Scheduling, 2004, International Journal of Parallel Programming.
[25] John M. Mellor-Crummey, et al. Cross-architecture performance predictions for scientific applications using parameterized models, 2004, SIGMETRICS '04/Performance '04.
[26] Alan Jay Smith, et al. Analysis of benchmark characteristics and benchmark performance prediction, 1996, TOCS.
[27] Gang Ren, et al. A comparison of empirical and model-driven optimization, 2003, PLDI '03.
[28] Dingxing Wang, et al. ORC-OpenMP: An OpenMP Compiler Based on ORC, 2004, International Conference on Computational Science.
[29] Israel Koren, et al. An analytical model of high performance superscalar-based multiprocessors, 1995, PACT.