Loop fusion for clustered VLIW architectures
Philip H. Sweany | Yi Qian | Steve Carr
[1] D.A. Reed, et al. An Integrated Compilation and Performance Analysis Environment for Data Parallel Programs, 1995, Proceedings of the IEEE/ACM SC95 Conference.
[2] Philip H. Sweany, et al. Register assignment for software pipelining with partitioned register banks, 2000, Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000.
[3] Antonio González, et al. Instruction scheduling for clustered VLIW architectures, 2000, ISSS '00.
[4] Alexander Aiken, et al. Optimal loop parallelization, 1988, PLDI '88.
[5] Ken Kennedy, et al. Maximizing Loop Parallelism and Improving Data Locality via Loop Fusion and Distribution, 1993, LCPC.
[6] Ken Kennedy, et al. Improving the ratio of memory operations to floating-point operations in loops, 1994, TOPL.
[7] Antonio González, et al. The effectiveness of loop unrolling for modulo scheduling in clustered VLIW architectures, 2000, Proceedings 2000 International Conference on Parallel Processing.
[8] Philip H. Sweany, et al. Global register partitioning, 2000, Proceedings 2000 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.PR00622).
[9] David A. Poplawski. The unlimited resource machine (URM), 1995.
[10] Philip H. Sweany, et al. Loop Transformations for Architectures with Partitioned Register Banks, 2001, OM '01.
[11] Ken Kennedy, et al. RETROSPECTIVE: Coloring Heuristics for Register Allocation, 2022.
[12] Vicki H. Allan, et al. Software pipelining, 1995, CSUR.
[13] Alexandre E. Eichenberger, et al. Effective cluster assignment for modulo scheduling, 1998, Proceedings. 31st Annual ACM/IEEE International Symposium on Microarchitecture.
[14] M. Rajagopalan, et al. Software Pipelining: Petri Net Pacemaker, 1993, Architectures and Compilation Techniques for Fine and Medium Grain Parallelism.
[15] Thomas M. Conte, et al. Unified assign and schedule: a new approach to scheduling for clustered register file microarchitectures, 1998, Proceedings. 31st Annual ACM/IEEE International Symposium on Microarchitecture.
[16] Philip H. Sweany, et al. Value cloning for architectures with partitioned register banks, 1998.
[17] B. Ramakrishna Rau, et al. Iterative modulo scheduling: an algorithm for software pipelining loops, 1994, MICRO 27.