Symbolic parallelization of loop programs for massively parallel processor arrays
[1] Jürgen Teich, et al. Invasive Computing: An Overview, 2011, Multiprocessor System-on-Chip.
[2] Forrest Brewer, et al. On applicability of symbolic techniques to larger scheduling problems, 1995, Proceedings the European Design and Test Conference. ED&TC 1995.
[3] Lothar Thiele, et al. On the design of piecewise regular processor arrays, 1989, IEEE International Symposium on Circuits and Systems.
[4] Oscar H. Ibarra, et al. On symbolic scheduling and parallel complexity of loops, 1995, Proceedings. Seventh IEEE Symposium on Parallel and Distributed Processing.
[5] J. Ramanujam, et al. Automatic C-to-CUDA Code Generation for Affine Programs, 2010, CC.
[6] G.E. Moore, et al. Cramming More Components Onto Integrated Circuits, 1998, Proceedings of the IEEE.
[7] Steven S. Muchnick, et al. Advanced Compiler Design and Implementation, 1997.
[8] Leslie Lamport, et al. The parallel execution of DO loops, 1974, CACM.
[9] Jingling Xue, et al. Automatic Parallelization of Tiled Loop Nests with Enhanced Fine-Grained Parallelism on GPUs, 2012, 2012 41st International Conference on Parallel Processing.
[10] B. Ramakrishna Rau, et al. A Constructive Solution to the Juggling Problem in Systolic Array Synthesis, 2000.
[11] Sriram Krishnamoorthy, et al. Parametric multi-level tiling of imperfectly nested loops, 2009, ICS.
[12] Sanjay V. Rajopadhye, et al. Parameterized loop tiling, 2012, TOPL.
[13] Frédéric Vivien, et al. A constructive solution to the juggling problem in processor array synthesis, 2000, Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000.
[14] Frank Hannig, et al. Scheduling Techniques for High-Throughput Loop Accelerators, 2009.
[15] Jürgen Teich, et al. PARO: Synthesis of Hardware Accelerators for Multi-Dimensional Dataflow-Intensive Applications, 2008, ARC.
[16] Uday Bondhugula, et al. A practical automatic polyhedral parallelizer and locality optimizer, 2008, PLDI '08.
[17] J. Ramanujam, et al. DynTile: Parametric tiled loop generation for parallel execution on multicore processors, 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS).
[18] Jürgen Teich, et al. A highly parameterizable parallel processor array architecture, 2006, 2006 IEEE International Conference on Field Programmable Technology.
[19] François Irigoin, et al. Supernode partitioning, 1988, POPL '88.
[20] Sanjay V. Rajopadhye, et al. Parameterized tiled loops for free, 2007, PLDI '07.
[21] J. Ramanujam, et al. Parametric Tiling of Affine Loop Nests, 2010.
[22] S. Mahlke, et al. Multicore compilation strategies and challenges, 2009, IEEE Signal Processing Magazine.
[23] Yves Robert, et al. Affine-by-Statement Scheduling of Uniform and Affine Loop Nests over Parametric, 1995, J. Parallel Distributed Comput.
[24] Larry Carter, et al. Selecting tile shape for minimal execution time, 1999, SPAA '99.
[25] I. Radivojevic, et al. Symbolic Scheduling Techniques, 1995, IEICE Trans. Inf. Syst.
[26] Jingling Xue, et al. Loop Tiling for Parallelism, 2000, Kluwer International Series in Engineering and Computer Science.