Automatic Compilation of Loops to Exploit Operator Parallelism on Configurable Arithmetic Logic Units
Santosh Pande | Ram Subramanian | Narasimhan Ramasubramanian
[1] Mahmut T. Kandemir,et al. A Loop Transformation Algorithm Based on Explicit Data Layout Representation for Optimizing Locality , 1998, LCPC.
[2] Monica S. Lam,et al. A Loop Transformation Theory and an Algorithm to Maximize Parallelism , 1991, IEEE Trans. Parallel Distributed Syst.
[3] Monica S. Lam,et al. Global optimizations for parallelism and locality on scalable parallel machines , 1993, PLDI '93.
[4] Harvey F. Silverman,et al. Processor reconfiguration through instruction-set metamorphosis , 1993, Computer.
[5] John Wawrzynek,et al. Instruction-Level Parallelism for Reconfigurable Computing , 1998, FPL.
[6] Thomas Fahringer. Estimating and Optimizing Performance for Parallel Programs , 1995, Computer.
[7] Brad L. Hutchings,et al. Supporting FPGA microprocessors through retargetable software tools , 1996, 1996 Proceedings IEEE Symposium on FPGAs for Custom Computing Machines.
[8] Guy E. Blelloch,et al. Solving Linear Recurrences with Loop Raking , 1995, J. Parallel Distributed Comput.
[9] Mahmut Kandemir,et al. An Iteration Space Transformation Algorithm Based on Explicit Data Layout Representation for Optimizing Locality , 1999 .
[10] Keshav Pingali,et al. Data-centric multi-level blocking , 1997, PLDI '97.
[11] Brad L. Hutchings,et al. A dynamic instruction set computer , 1995, Proceedings IEEE Symposium on FPGAs for Custom Computing Machines.
[12] Michael Wolfe,et al. High performance compilers for parallel computing , 1995 .
[13] Santosh Pande. A compile time partitioning method for DOALL loops on distributed memory systems , 1996, Proceedings of the 1996 ICPP Workshop on Challenges for Parallel Processing.
[14] Vivek Sarkar,et al. Baring It All to Software: Raw Machines , 1997, Computer.
[15] A. Smith,et al. PRISM-II compiler and architecture , 1993, [1993] Proceedings IEEE Workshop on FPGAs for Custom Computing Machines.
[16] Joe D. Warren,et al. The program dependence graph and its use in optimization , 1987, TOPLAS.
[17] David A. Padua,et al. Automatic Array Privatization , 1993, Compiler Optimizations for Scalable Parallel Systems Languages.
[18] Santosh Pande,et al. Automatic Analysis of Loops to Exploit Operator Parallelism on Reconfigurable Systems , 1998, LCPC.
[19] John Wawrzynek,et al. Garp: a MIPS processor with a reconfigurable coprocessor , 1997, Proceedings. The 5th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (Cat. No.97TB100186).
[20] Amos R. Omondi,et al. Computer arithmetic systems - algorithms, architecture and implementation , 1994, Prentice Hall International series in computer science.
[21] Utpal Banerjee,et al. Loop Transformations for Restructuring Compilers: The Foundations , 1993, Springer US.
[22] Maya Gokhale,et al. Malleable architecture generator for FPGA computing , 1996, Other Conferences.
[23] Rice University,et al. High performance Fortran language specification , 1993 .
[24] Geoffrey Brown,et al. A software development system for FPGA-based data acquisition systems , 1996, 1996 Proceedings IEEE Symposium on FPGAs for Custom Computing Machines.
[25] Viktor K. Prasanna,et al. Seeking Solutions in Configurable Computing , 1997, Computer.
[26] Carl Ebeling,et al. Specifying and compiling applications for RaPiD , 1998, Proceedings. IEEE Symposium on FPGAs for Custom Computing Machines (Cat. No.98TB100251).
[27] P. Sadayappan,et al. An approach to communication-efficient data redistribution , 1994, ICS '94.
[28] Keshav Pingali,et al. Solving Alignment Using Elementary Linear Algebra , 2001, Compiler Optimizations for Scalable Parallel Systems Languages.
[29] Zhiyuan Li,et al. Configuration compression for the Xilinx XC6200 FPGA , 1998, Proceedings. IEEE Symposium on FPGAs for Custom Computing Machines (Cat. No.98TB100251).
[30] Krishna V. Palem,et al. Adaptive explicitly parallel instruction computing , 2001 .