Profile-Based Speculation

[1]  Sanjay J. Patel,et al.  rePLay: A Hardware Framework for Dynamic Program Optimization , 1999.

[2]  Thomas M. Conte,et al.  Accurate and practical profile-driven compilation using the profile buffer , 1996, Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture. MICRO 29.

[3]  Robert J. Hall,et al.  Call path profiling , 1992, International Conference on Software Engineering.

[4]  Youfeng Wu.  Strength reduction of multiplications by integer constants , 1995, SIGP.

[5]  Robert S. Cohn,et al.  Hot cold optimization of large Windows/NT applications , 1996, Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture. MICRO 29.

[6]  Martin Hirzel,et al.  Bursty Tracing: A Framework for Low-Overhead Temporal Profiling , 2001.

[7]  Scott A. Mahlke,et al.  Superblock formation using static program analysis , 1993, Proceedings of the 26th Annual International Symposium on Microarchitecture.

[8]  James R. Larus,et al.  Branch prediction for free , 1993, PLDI '93.

[9]  Joseph A. Fisher,et al.  Predicting conditional branch directions from previous runs of a program , 1992, ASPLOS V.

[10]  Youfeng Wu,et al.  Efficient discovery of regular stride patterns in irregular programs and its use in compiler prefetching , 2002, PLDI '02.

[11]  Thomas R. Gross,et al.  Avoidance and suppression of compensation code in a trace scheduling compiler , 1994, TOPL.

[12]  Vasanth Bala,et al.  Transparent Dynamic Optimization , 1999.

[13]  K. Ebcioglu,et al.  DAISY: Dynamic Compilation for 100% Architectural Compatibility , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.

[14]  Yun Wang,et al.  IA-32 execution layer: a two-phase dynamic translator designed to support IA-32 applications on Itanium®-based systems , 2003, Proceedings. 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003. MICRO-36.

[15]  Jack W. Davidson,et al.  Profile guided code positioning , 1990, SIGP.

[16]  James R. Larus,et al.  Static branch frequency and program profile analysis , 1994, MICRO 27.

[17]  Scott A. Mahlke,et al.  Using profile information to assist classic code optimizations , 1991, Softw. Pract. Exp.

[18]  James R. Larus,et al.  Cache-conscious structure definition , 1999, PLDI '99.

[19]  Wei-Chung Hsu,et al.  The performance of runtime data cache prefetching in a dynamic optimization system , 2003, Proceedings. 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003. MICRO-36.

[20]  Michael D. Smith,et al.  Better global scheduling using path profiles , 1998, Proceedings. 31st Annual ACM/IEEE International Symposium on Microarchitecture.

[21]  Wei-Chung Hsu,et al.  On the predictability of program behavior using different input data sets , 2002, Proceedings Sixth Annual Workshop on Interaction between Compilers and Computer Architectures.

[22]  David A. Wood,et al.  Cache profiling and the SPEC benchmarks: a case study , 1994, Computer.

[23]  Youfeng Wu,et al.  Better exploration of region-level value locality with integrated computation reuse and value prediction , 2001, ISCA 2001.

[24]  Donald E. Knuth,et al.  Optimal measurement points for program frequency counts , 1973.

[25]  Viktor K. Prasanna,et al.  Tiling, Block Data Layout, and Memory Hierarchy Performance , 2003, IEEE Trans. Parallel Distributed Syst.

[26]  Richard E. Hank,et al.  Region-based compilation: an introduction and motivation , 1995, MICRO 1995.

[27]  David R. Karger,et al.  Near-optimal intraprocedural branch alignment , 1997, PLDI '97.

[28]  David R. Kaeli,et al.  Analysis of Temporal-Based Program Behavior for Improved Instruction Cache Performance , 1999, IEEE Trans. Computers.

[29]  Richard Johnson,et al.  The Transmeta Code Morphing™ Software: using speculation, recovery, and adaptive retranslation to address real-life challenges , 2003, International Symposium on Code Generation and Optimization, 2003. CGO 2003.

[30]  Zheng Wang,et al.  System support for automatic profiling and optimization , 1997, SOSP.

[31]  Wen-mei W. Hwu,et al.  Trace selection for compiling large C application programs to microcode , 1988, MICRO 1988.

[32]  James R. Larus,et al.  Exploiting hardware performance counters with flow and context sensitive profiling , 1997, PLDI '97.

[33]  James E. Smith,et al.  Rapid profiling via stratified sampling , 2001, ISCA 2001.

[34]  Rastislav Bodík,et al.  An efficient profile-analysis framework for data-layout optimizations , 2002, POPL '02.

[35]  Scott Mahlke,et al.  Sentinel scheduling: a model for compiler-controlled speculative execution , 1993.

[36]  Dror Rawitz,et al.  The hardness of cache conscious data placement , 2002, POPL '02.

[37]  Martin Hirzel,et al.  Dynamic hot data stream prefetching for general-purpose programs , 2002, PLDI '02.

[38]  Mauricio J. Serrano,et al.  Prefetch injection based on hardware monitoring and object metadata , 2004, PLDI '04.

[39]  Youfeng Wu,et al.  Continuous trip count profiling for loop optimization in two-phase dynamic binary translators , 2004, Eighth Workshop on Interaction between Compilers and Computer Architectures, 2004. INTERACT-8 2004.

[40]  Youfeng Wu,et al.  Accuracy of profile maintenance in optimizing compilers , 2002, Proceedings Sixth Annual Workshop on Interaction between Compilers and Computer Architectures.

[41]  Todd C. Mowry,et al.  Understanding Why Correlation Profiling Improves the Predictability of Data Cache Misses in Nonnumeric Applications , 2000, IEEE Trans. Computers.

[42]  Chandra Krintz,et al.  Cache-conscious data placement , 1998, ASPLOS VIII.

[43]  Avi Mendelson,et al.  Can program profiling support value prediction? , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.

[44]  James R. Larus,et al.  Efficient path profiling , 1996, Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture. MICRO 29.

[45]  Scott A. Mahlke,et al.  The superblock: An effective technique for VLIW and superscalar compilation , 1993, The Journal of Supercomputing.

[46]  Jignesh M. Patel,et al.  Call graph prefetching for database applications , 2003, TOCS.

[47]  Vasanth Bala,et al.  Software Profiling for Hot Path Prediction: Less is More , 2000, ASPLOS.

[48]  John Paul Shen,et al.  Post-pass binary adaptation for software-based speculative precomputation , 2002, PLDI '02.

[49]  Rajiv Gupta,et al.  Path profile guided partial dead code elimination using predication , 1997, Proceedings 1997 International Conference on Parallel Architectures and Compilation Techniques.

[50]  Brad Calder,et al.  Phase tracking and prediction , 2003, ISCA '03.

[51]  James R. Larus,et al.  Optimally profiling and tracing programs , 1994, TOPL.

[52]  Brad Calder,et al.  Value profiling , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.

[53]  Roy Dz-Ching Ju,et al.  A compiler framework for speculative analysis and optimizations , 2003, PLDI '03.

[54]  Youfeng Wu,et al.  The accuracy of initial prediction in two-phase dynamic binary translators , 2004, International Symposium on Code Generation and Optimization, 2004. CGO 2004.

[55]  Michael A. Harrison,et al.  Accurate static estimators for program optimization , 1994, PLDI '94.

[56]  Michael Gschwind,et al.  Dynamic Binary Translation and Optimization , 2001, IEEE Trans. Computers.

[57]  Kunle Olukotun,et al.  The Jrpm system for dynamically parallelizing Java programs , 2003, ISCA '03.

[58]  Wen-mei W. Hwu,et al.  Compiler-directed dynamic computation reuse: rationale and initial results , 1999, MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture.