Symbolic Mapping of Loop Programs onto Processor Arrays
[1] J. Ramanujam,et al. Automatic C-to-CUDA Code Generation for Affine Programs , 2010, CC.
[2] G.E. Moore,et al. Cramming More Components Onto Integrated Circuits , 1998, Proceedings of the IEEE.
[3] Sanjay V. Rajopadhye,et al. Parameterized loop tiling , 2012, TOPL.
[4] Frédéric Vivien,et al. A constructive solution to the juggling problem in processor array synthesis , 2000, Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000.
[5] Yves Robert,et al. Linear scheduling is close to optimality , 1992, [1992] Proceedings of the International Conference on Application Specific Array Processors.
[6] Narayanan Vijaykrishnan,et al. Run-time adaption for highly-complex multi-core systems , 2013, 2013 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS).
[7] Jürgen Teich,et al. Mapping a class of dependence algorithms to coarse-grained reconfigurable arrays: architectural parameters and methodology , 2006, Int. J. Embed. Syst..
[8] Frank Hannig,et al. Invasive Tightly-Coupled Processor Arrays , 2014, ACM Trans. Embed. Comput. Syst..
[9] Jürgen Teich,et al. Exact Partitioning of Affine Dependence Algorithms , 2002, Embedded Processor Design Challenges.
[10] François Irigoin,et al. Supernode partitioning , 1988, POPL '88.
[11] Jürgen Teich,et al. Partitioning Processor Arrays under Resource Constraints , 1997, J. VLSI Signal Process..
[12] Jürgen Teich,et al. Decentralized dynamic resource management support for massively parallel processor arrays , 2011, ASAP 2011 - 22nd IEEE International Conference on Application-specific Systems, Architectures and Processors.
[13] Jürgen Teich,et al. Scalable Many-Domain Power Gating in Coarse-Grained Reconfigurable Processor Arrays , 2011, IEEE Embedded Systems Letters.
[14] B. Ramakrishna Rau,et al. A Constructive Solution to the Juggling Problem in Systolic Array Synthesis , 2000 .
[15] Oscar H. Ibarra,et al. On symbolic scheduling and parallel complexity of loops , 1995, Proceedings. Seventh IEEE Symposium on Parallel and Distributed Processing.
[16] J. Ramanujam,et al. DynTile: Parametric tiled loop generation for parallel execution on multicore processors , 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS).
[17] Jürgen Teich,et al. A highly parameterizable parallel processor array architecture , 2006, 2006 IEEE International Conference on Field Programmable Technology.
[18] Jingling Xue,et al. Model-Driven Tile Size Selection for DOACROSS Loops on GPUs , 2011, Euro-Par.
[19] Yves Robert,et al. Affine-by-Statement Scheduling of Uniform and Affine Loop Nests over Parametric , 1995, J. Parallel Distributed Comput..
[20] Larry Carter,et al. Selecting tile shape for minimal execution time , 1999, SPAA '99.
[21] Jürgen Teich,et al. Towards Symbolic Run-Time Reconfiguration in Tightly-Coupled Processor Arrays , 2011, 2011 International Conference on Reconfigurable Computing and FPGAs.
[22] Jürgen Teich,et al. Hierarchical power management for adaptive tightly-coupled processor arrays , 2013, TODE.
[23] Jürgen Teich,et al. Distributed Resource Reservation in Massively Parallel Processor Arrays , 2011, 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum.
[24] Lothar Thiele,et al. On the design of piecewise regular processor arrays , 1989, IEEE International Symposium on Circuits and Systems.
[25] Jürgen Teich,et al. System integration of tightly-coupled processor arrays using reconfigurable buffer structures , 2013, CF '13.
[26] Uday Bondhugula,et al. A practical automatic polyhedral parallelizer and locality optimizer , 2008, PLDI '08.
[27] Sanjay V. Rajopadhye,et al. Parameterized tiled loops for free , 2007, PLDI '07.
[28] Jürgen Teich,et al. Resource-aware programming and simulation of MPSoC architectures through extension of X10 , 2011, SCOPES.
[29] S. Mahlke,et al. Multicore compilation strategies and challenges , 2009, IEEE Signal Processing Magazine.
[30] Jürgen Teich,et al. Scheduling of partitioned regular algorithms on processor arrays with constrained resources , 1996, Proceedings of International Conference on Application Specific Systems, Architectures and Processors: ASAP '96.
[31] Steven S. Muchnick,et al. Advanced Compiler Design and Implementation , 1997 .
[32] I. Radivojevic,et al. Symbolic Scheduling Techniques , 1995, IEICE Trans. Inf. Syst..
[33] Jingling Xue,et al. Loop Tiling for Parallelism , 2000, Kluwer International Series in Engineering and Computer Science.
[34] Jürgen Teich,et al. Invasive computing - Concepts and overheads , 2012, Proceeding of the 2012 Forum on Specification and Design Languages.
[35] Frank Hannig,et al. Scheduling Techniques for High-Throughput Loop Accelerators , 2009 .
[36] Jürgen Teich,et al. Invasive Algorithms and Architectures Invasive Algorithmen und Architekturen , 2008, it Inf. Technol..
[37] Thomas Kailath,et al. Regular iterative algorithms and their implementation on processor arrays , 1988, Proc. IEEE.
[38] Jürgen Teich,et al. Invasive Computing: An Overview , 2011, Multiprocessor System-on-Chip.
[39] Jürgen Teich,et al. Symbolic parallelization of loop programs for massively parallel processor arrays , 2013, 2013 IEEE 24th International Conference on Application-Specific Systems, Architectures and Processors.
[40] Jürgen Teich,et al. Loop program mapping and compact code generation for programmable hardware accelerators , 2013, 2013 IEEE 24th International Conference on Application-Specific Systems, Architectures and Processors.
[41] Sriram Krishnamoorthy,et al. Parametric multi-level tiling of imperfectly nested loops , 2009, ICS.
[42] Weijia Shang,et al. Time Optimal Linear Schedules for Algorithms with Uniform Dependencies , 1991, IEEE Trans. Computers.
[43] Leslie Lamport,et al. The parallel execution of DO loops , 1974, CACM.
[44] Jürgen Teich,et al. PARO: Synthesis of Hardware Accelerators for Multi-Dimensional Dataflow-Intensive Applications , 2008, ARC.
[45] Jingling Xue,et al. Automatic Parallelization of Tiled Loop Nests with Enhanced Fine-Grained Parallelism on GPUs , 2012, 2012 41st International Conference on Parallel Processing.