Compact Code Generation for Tightly-Coupled Processor Arrays
[1] Lothar Thiele,et al. Resource constrained scheduling of uniform algorithms , 1993, J. VLSI Signal Process..
[2] Jürgen Teich,et al. Accuracy and performance analysis of Harris Corner computation on tightly-coupled processor arrays , 2013, 2013 Conference on Design and Architectures for Signal and Image Processing.
[3] Rudy Lauwereins,et al. Design methodology for a tightly coupled VLIW/reconfigurable matrix architecture: a case study , 2004, Proceedings Design, Automation and Test in Europe Conference and Exhibition.
[4] Paul Feautrier,et al. Polyhedron Model , 2011, Encyclopedia of Parallel Computing.
[5] Jürgen Teich. A compiler for application specific processor arrays , 1993 .
[6] Rudy Lauwereins,et al. DRESC: a retargetable compiler for coarse-grained reconfigurable architectures , 2002, 2002 IEEE International Conference on Field-Programmable Technology (FPT).
[7] Jürgen Teich,et al. Resource constrained and speculative scheduling of an algorithm class with run-time dependent conditionals , 2004 .
[8] Christian Lengauer,et al. Towards systolizing compilation , 1991, Distributed Computing.
[9] Jürgen Teich,et al. Mapping a class of dependence algorithms to coarse-grained reconfigurable arrays: architectural parameters and methodology , 2006, Int. J. Embed. Syst..
[10] Jürgen Teich,et al. A highly parameterizable parallel processor array architecture , 2006, 2006 IEEE International Conference on Field Programmable Technology.
[11] Jürgen Teich,et al. Loop program mapping and compact code generation for programmable hardware accelerators , 2013, 2013 IEEE 24th International Conference on Application-Specific Systems, Architectures and Processors.
[12] Michael Wolfe,et al. High performance compilers for parallel computing , 1995 .
[13] Fadi J. Kurdahi,et al. MorphoSys: An Integrated Reconfigurable System for Data-Parallel and Computation-Intensive Applications , 2000, IEEE Trans. Computers.
[14] Lothar Thiele,et al. On the hierarchical design of VLSI processor arrays , 1988, 1988 IEEE International Symposium on Circuits and Systems.
[15] Jürgen Teich,et al. Partitioning of processor arrays: a piecewise regular approach , 1993, Integr..
[16] Jürgen Teich,et al. A prototype of an invasive tightly-coupled processor array , 2012, Proceedings of the 2012 Conference on Design and Architectures for Signal and Image Processing.
[17] Jürgen Teich,et al. Hierarchical Partitioning for Piecewise Linear Algorithms , 2006, International Symposium on Parallel Computing in Electrical Engineering (PARELEC'06).
[18] Kiyoung Choi,et al. An algorithm for mapping loops onto coarse-grained reconfigurable architectures , 2003 .
[19] François Irigoin,et al. Supernode partitioning , 1988, POPL '88.
[20] Sumit Gupta,et al. SPARK: A Parallelizing Approach to the High-Level Synthesis of Digital Circuits , 2004 .
[21] David B. Whalley,et al. Effective exploitation of a zero overhead loop buffer , 1999, LCTES '99.
[22] Jürgen Teich,et al. A Dynamically Reconfigurable Weakly Programmable Processor Array Architecture Template , 2006, ReCoSoC.
[23] Fadi J. Kurdahi,et al. Automatic compilation to a coarse-grained reconfigurable system-on-chip , 2003, TECS.
[24] Luca Benini,et al. Platform 2012, a many-core computing accelerator for embedded SoCs: Performance evaluation of visual analytics applications , 2012, DAC Design Automation Conference 2012.
[25] Jürgen Teich,et al. The PAULA Language for Designing Multi-Dimensional Dataflow-Intensive Applications , 2008, MBMV.
[26] Vikram S. Adve,et al. LLVM: a compilation framework for lifelong program analysis & transformation , 2004, International Symposium on Code Generation and Optimization (CGO 2004).
[27] Frank Hannig,et al. Invasive Tightly-Coupled Processor Arrays , 2014, ACM Trans. Embed. Comput. Syst..
[28] David Padua,et al. Encyclopedia of Parallel Computing , 2011 .
[29] Jürgen Teich,et al. Partitioning Processor Arrays under Resource Constraints , 1997, J. VLSI Signal Process..
[30] D.I. Moldovan,et al. On the design of algorithms for VLSI systolic arrays , 1983, Proceedings of the IEEE.
[31] B. Ramakrishna Rau,et al. Iterative modulo scheduling: an algorithm for software pipelining loops , 1994, MICRO 27.
[32] Jürgen Teich,et al. PARO: Synthesis of Hardware Accelerators for Multi-Dimensional Dataflow-Intensive Applications , 2008, ARC.
[33] Jürgen Teich,et al. High-Level Synthesis Revised - Generation of FPGA Accelerators from a Domain-Specific Language using the Polyhedron Model , 2013, PARCO.
[34] Francky Catthoor,et al. Compilation Technique for Loop Overhead Minimization , 2009, 2009 12th Euromicro Conference on Digital System Design, Architectures, Methods and Tools.
[35] Frank Hannig,et al. Scheduling Techniques for High-Throughput Loop Accelerators , 2009 .
[36] Christian Lengauer,et al. Loop Parallelization in the Polytope Model , 1993, CONCUR.