Dynamic trace-based analysis of vectorization potential of applications
P. Sadayappan | Atanas Rountev | Louis-Noël Pouchet | Mahesh Ravishankar | Naznin Fauzia | Justin Holewinski | Ragavendar Ramamurthi
[1] Saturnino Garcia, et al. Kremlin: rethinking and rebooting gprof for the multicore age, 2011, PLDI '11.
[2] L. Rauchwerger, et al. The LRPD Test: Speculative Run-Time Parallelization of Loops with Privatization and Reduction Parallelization, 1999, IEEE Trans. Parallel Distributed Syst.
[3] Gary S. Tyson, et al. The limits of instruction level parallelism in SPEC95 applications, 1999, CARN.
[4] Kevin B. Theobald, et al. On the limits of program parallelism and its smoothability, 1992, MICRO 1992.
[5] Todd M. Austin, et al. Dynamic dependency analysis of ordinary programs, 1992, ISCA '92.
[6] Manoj Kumar, et al. Measuring Parallelism in Computation-Intensive Scientific/Engineering Applications, 1988, IEEE Trans. Computers.
[7] Erez Petrank, et al. New Algorithms for SIMD Alignment, 2007, CC.
[8] Monica S. Lam, et al. Limits of control flow on parallelism, 1992, ISCA '92.
[9] Margaret Martonosi, et al. Limits and Graph Structure of Available Instruction-Level Parallelism (Research Note), 2000, Euro-Par.
[10] Xiangyu Zhang, et al. Cost effective dynamic program slicing, 2004, PLDI '04.
[11] Ayal Zaks, et al. Auto-vectorization of interleaved data for SIMD, 2006, PLDI '06.
[12] Xiangyu Zhang, et al. Cost and precision tradeoffs of dynamic data slicing algorithms, 2005, TOPLAS.
[13] Scott A. Mahlke, et al. Uncovering hidden loop level parallelism in sequential applications, 2008, 2008 IEEE 14th International Symposium on High Performance Computer Architecture.
[14] Alexandru Nicolau, et al. Measuring the Parallelism Available for Very Long Instruction Word Architectures, 1984, IEEE Transactions on Computers.
[15] Michael Wolfe, et al. High performance compilers for parallel computing, 1995.
[16] David W. Wall, et al. Limits of instruction-level parallelism, 1991, ASPLOS IV.
[17] Saman P. Amarasinghe, et al. Exploiting superword level parallelism with multimedia instruction sets, 2000, PLDI '00.
[18] Peng Wu, et al. Vectorization for SIMD architectures with alignment constraints, 2004, PLDI '04.
[19] Xiangyu Zhang, et al. Enabling tracing of long-running multithreaded programs via dynamic execution reduction, 2007, ISSTA '07.
[20] Peng Wu, et al. Compiler-Driven Dependence Profiling to Guide Program Parallelization, 2008, LCPC.
[21] Rajiv Gupta, et al. Speculative Parallelization of Sequential Loops on Multicores, 2009, International Journal of Parallel Programming.
[22] Alan Mycroft, et al. Limits of parallelism using dynamic dependency graphs, 2009, WODA '09.
[23] Alan Mycroft, et al. Set-Congruence Dynamic Analysis for Thread-Level Speculation (TLS), 2008, LCPC.
[24] Yun Zhang, et al. Revisiting the Sequential Programming Model for Multi-Core, 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[25] James R. Larus, et al. Loop-Level Parallelism in Numeric and Symbolic Programs, 1993, IEEE Trans. Parallel Distributed Syst.
[26] Ken Kennedy, et al. Optimizing Compilers for Modern Architectures: A Dependence-based Approach, 2001.
[27] Michael F. P. O'Boyle, et al. Towards a holistic approach to auto-parallelization: integrating profile-driven parallelism detection and machine-learning based mapping, 2009, PLDI '09.
[28] Rajiv Gupta, et al. Unified control flow and data dependence traces, 2007, TACO.
[29] Vikram S. Adve, et al. LLVM: a compilation framework for lifelong program analysis & transformation, 2004, International Symposium on Code Generation and Optimization, 2004. CGO 2004.
[30] P. Sadayappan, et al. Understanding parallelism-inhibiting dependences in sequential Java programs, 2010, 2010 IEEE International Conference on Software Maintenance.
[31] Lawrence Rauchwerger, et al. Measuring limits of parallelism and characterizing its vulnerability to resource constraints, 1993, Proceedings of the 26th Annual International Symposium on Microarchitecture.
[32] Xiangyu Zhang, et al. Whole execution traces and their applications, 2005, TACO.
[33] Rajiv Gupta, et al. Copy or Discard execution model for speculative parallelization on multicores, 2008, 2008 41st IEEE/ACM International Symposium on Microarchitecture.
[34] Xiaotong Zhuang, et al. Exploiting Parallelism with Dependence-Aware Scheduling, 2009, 2009 18th International Conference on Parallel Architectures and Compilation Techniques.
[35] Andreas Zeller, et al. Profiling Java programs for parallelism, 2009, 2009 ICSE Workshop on Multicore Software Engineering.