Complementing software pipelining with software thread integration
Won So | Alexander G. Dean
[1] Wen-mei W. Hwu,et al. Modulo scheduling of loops in control-intensive non-numeric programs , 1996, Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture. MICRO 29.
[2] Philip H. Sweany,et al. Improving software pipelining with unroll-and-jam , 1996, Proceedings of HICSS-29: 29th Hawaii International Conference on System Sciences.
[3] Philip H. Sweany,et al. Optimizing loop performance for clustered VLIW architectures , 2002, Proceedings. International Conference on Parallel Architectures and Compilation Techniques.
[4] John M. Mellor-Crummey,et al. FIAT: A Framework for Interprocedural Analysis and Transformation , 1993, LCPC.
[5] William Thies,et al. StreamIt: A Language for Streaming Applications , 2002, CC.
[6] William J. Dally,et al. Imagine: Media Processing with Streams , 2001, IEEE Micro.
[7] Ernst L. Leiss,et al. Modulo scheduling for the TMS320C6x VLIW DSP architecture , 1999, LCTES '99.
[8] K. Yelick,et al. Generating Permutation Instructions from a High-Level Description , 2004 .
[9] Alexander Aiken,et al. Perfect Pipelining: A New Loop Parallelization Technique , 1988, ESOP.
[10] Eric Stotzer,et al. Modulo scheduling for the TMS320C6x VLIW DSP architecture , 1999 .
[11] Henry Hoffmann,et al. The Raw Microprocessor: A Computational Fabric for Software Circuits and General-Purpose Programs , 2002, IEEE Micro.
[12] Henry Hoffmann,et al. A stream compiler for communication-exposed architectures , 2002, ASPLOS X.
[13] Jian Wang,et al. GURPR—a method for global software pipelining , 1987, MICRO 20.
[14] Ken Kennedy,et al. Estimating Interlock and Improving Balance for Pipelined Architectures , 1988, J. Parallel Distributed Comput..
[15] Paul Le Guernic,et al. SIGNAL: A declarative language for synchronous programming of real-time systems , 1987, FPCA.
[16] Ken Kennedy,et al. Conversion of control dependence to data dependence , 1983, POPL '83.
[17] Gérard Berry,et al. The Esterel Synchronous Programming Language: Design, Semantics, Implementation , 1992, Sci. Comput. Program..
[18] Philip H. Sweany,et al. Loop fusion for clustered VLIW architectures , 2002, LCTES/SCOPES '02.
[19] Joe D. Warren,et al. The program dependence graph and its use in optimization , 1987, TOPL.
[20] Margarida F. Jacome,et al. Compiler-directed ILP extraction for clustered VLIW/EPIC machines: predication, speculation and modulo scheduling , 2003, 2003 Design, Automation and Test in Europe Conference and Exhibition.
[21] Corinna G. Lee,et al. Software pipelining loops with conditional branches , 1996, Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture. MICRO 29.
[22] Monica S. Lam,et al. Interprocedural Analysis for Parallelization , 1995, LCPC.
[23] Junqiang Sun,et al. TMS320C6000 CPU and Instruction Set Reference Guide , 2000 .
[24] Krishna Subramanian,et al. Enhanced modulo scheduling for loops with conditional branches , 1992, MICRO 25.
[25] Scott A. Mahlke,et al. Reverse If-Conversion , 1993, PLDI '93.
[26] David Grove,et al. Selective specialization for object-oriented languages , 1995, PLDI '95.
[27] Won So,et al. Procedure cloning and integration for converting parallelism from coarse to fine grain , 2003, Seventh Workshop on Interaction Between Compilers and Computer Architectures, 2003. INTERACT-7 2003. Proceedings.
[28] Pascal Raymond,et al. The synchronous data flow programming language LUSTRE , 1991, Proc. IEEE.
[29] Richard A. Huff,et al. Lifetime-sensitive modulo scheduling , 1993, PLDI '93.
[30] Joe D. Warren,et al. The program dependence graph and its use in optimization , 1984, TOPL.
[31] Robert Stephens,et al. A survey of stream processing , 1997, Acta Informatica.
[32] Steve Carr,et al. Unroll-and-jam using uniformly generated sets , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[33] John Paul Shen,et al. Techniques for software thread integration in real-time embedded systems , 1998, Proceedings 19th IEEE Real-Time Systems Symposium (Cat. No.98CB36279).
[34] John Paul Shen,et al. System-level issues for software thread integration: guest triggering and host selection , 1999, Proceedings 20th IEEE Real-Time Systems Symposium (Cat. No.99CB37054).
[35] Alexander G. Dean. Compiling for fine-grain concurrency: planning and performing software thread integration , 2002, Proceedings Sixth Annual Workshop on Interaction between Compilers and Computer Architectures.
[36] Bennett B. Goldberg,et al. Trimaran - An Infrastructure for Compiler Research in Instruction Level Parallelism , 1998 .
[37] Ken Kennedy,et al. A Methodology for Procedure Cloning , 1993, Computer languages.