Architecture and Synthesis for Area-Efficient Pipelining of Irregular Loop Nests
Zhiru Zhang | Gai Liu | Ritchie Zhao | Mingxing Tan | Steve Dai
[1] Peng Li,et al. Deadlock avoidance for streaming computations with filtering , 2010, SPAA '10.
[2] Fabrizio Ferrandi,et al. Exploiting Outer Loops Vectorization in High Level Synthesis , 2015, ARCS.
[3] Vikram S. Adve,et al. LLVM: a compilation framework for lifelong program analysis & transformation , 2004, International Symposium on Code Generation and Optimization, 2004. CGO 2004.
[4] Jason Cong,et al. High-Level Synthesis for FPGAs: From Prototyping to Deployment , 2011, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[5] Hans Jurgen Mattausch,et al. Fast quadratic increase of multiport-storage-cell area with port number , 1999.
[6] Feng Liu,et al. CGPA: Coarse-Grained Pipelined Accelerators , 2014, 2014 51st ACM/EDAC/IEEE Design Automation Conference (DAC).
[7] Zhiru Zhang,et al. Area-efficient pipelining for FPGA-targeted high-level synthesis , 2015, 2015 52nd ACM/EDAC/IEEE Design Automation Conference (DAC).
[8] Zhiru Zhang,et al. Mapping-Aware Constrained Scheduling for LUT-Based FPGAs , 2015, FPGA.
[9] Babak Falsafi,et al. Meet the walkers: accelerating index traversals for in-memory databases , 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[10] Zhiru Zhang,et al. ElasticFlow: A complexity-effective approach for pipelining irregular loop nests , 2015, 2015 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).
[11] Jason Helge Anderson,et al. LegUp: high-level synthesis for FPGA-based processor/accelerator systems , 2011, FPGA '11.
[12] Zhiru Zhang,et al. Multithreaded pipeline synthesis for data-parallel kernels , 2014, 2014 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).
[13] Ken Kennedy,et al. Optimizing Compilers for Modern Architectures: A Dependence-based Approach , 2001.
[14] Steven Derrien,et al. Runtime dependency analysis for loop pipelining in High-Level Synthesis , 2013, 2013 50th ACM/EDAC/IEEE Design Automation Conference (DAC).
[15] Zhiru Zhang,et al. Flushing-enabled loop pipelining for high-level synthesis , 2014, 2014 51st ACM/EDAC/IEEE Design Automation Conference (DAC).
[16] Jason Cong,et al. Polyhedral-based data reuse optimization for configurable computing , 2013, FPGA '13.
[17] George A. Constantinides,et al. High-level synthesis of dynamic data structures: A case study using Vivado HLS , 2013, 2013 International Conference on Field-Programmable Technology (FPT).
[18] Timothy A. Davis,et al. The University of Florida sparse matrix collection , 2011, TOMS.
[19] J. Ramanujam,et al. Optimal software pipelining of nested loops , 1994, Proceedings of 8th International Parallel Processing Symposium.
[20] John Freeman,et al. OpenCL for FPGAs: Prototyping a Compiler , 2013.
[21] Yosi Ben-Asher,et al. Reducing Memory Constraints in Modulo Scheduling Synthesis for FPGAs , 2010, TRETS.
[22] Brad Fitzpatrick,et al. Distributed caching with memcached , 2004.
[23] B. Ramakrishna Rau,et al. Iterative modulo scheduling: an algorithm for software pipelining loops , 1994, MICRO 27.
[24] Zhiru Zhang,et al. SDC-based modulo scheduling for pipeline synthesis , 2013, 2013 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).
[25] Randolph E. Harr,et al. Efficient pipelining of nested loops: unroll-and-squash , 2002, Proceedings 16th International Parallel and Distributed Processing Symposium.