Trade-offs in loop transformations
[1] H.J. De Man,et al. Modeling data flow and control flow for high level memory management , 1992, [1992] Proceedings The European Conference on Design Automation.
[2] Rudolf Eigenmann,et al. Automatic program parallelization , 1993, Proc. IEEE.
[3] Hugo De Man,et al. Architecture-driven synthesis techniques for VLSI implementation of DSP algorithms , 1990, Proc. IEEE.
[4] Frédéric Vivien,et al. A unified framework for schedule and storage optimization , 2001, PLDI '01.
[5] Henk Corporaal,et al. Dealing with data dependent conditions to enable general global source code transformations , 2009, Int. J. Embed. Syst..
[6] Rudy Lauwereins,et al. Energy-Aware Runtime Scheduling for Embedded-Multiprocessor SOCs , 2001, IEEE Des. Test Comput..
[7] Tarek S. Abdelrahman,et al. Fusion of Loops for Parallelism and Locality , 1997, IEEE Trans. Parallel Distributed Syst..
[8] Hugo De Man,et al. A preprocessing step for global loop transformations for data transfer optimization , 2000, CASES '00.
[9] Yves Robert,et al. Affine-by-Statement Scheduling of Uniform and Affine Loop Nests over Parametric , 1995, J. Parallel Distributed Comput..
[10] Sven Verdoolaege. Loop transformations for data transfer and storage optimization , 2002 .
[11] Erik Brockmeyer,et al. Data Access and Storage Management for Embedded Programmable Processors , 2002, Springer US.
[12] Cheng Wang,et al. Data locality enhancement by memory reduction , 2001, ICS '01.
[13] Michael F. P. O'Boyle,et al. Array recovery and high-level transformations for DSP applications , 2003, TECS.
[14] Paul Feautrier,et al. Some efficient solutions to the affine scheduling problem. I. One-dimensional time , 1992, International Journal of Parallel Programming.
[15] William Pugh,et al. The Omega test: A fast and practical integer programming algorithm for dependence analysis , 1991, Proceedings of the 1991 ACM/IEEE Conference on Supercomputing (Supercomputing '91).
[16] Constantine D. Polychronopoulos. Compiler Optimizations for Enhancing Parallelism and Their Impact on Architecture Design , 1988, IEEE Trans. Computers.
[17] Henk Corporaal,et al. Combining data and instruction memory energy optimizations for embedded applications , 2005, 3rd Workshop on Embedded Systems for Real-Time Multimedia, 2005..
[18] Sanjay V. Rajopadhye,et al. Generation of Efficient Nested Loops from Polyhedra , 2000, International Journal of Parallel Programming.
[19] Heiko Falk,et al. Control Flow Optimization by Loop Nest Splitting at the Source Code Level , 2002 .
[20] Francky Catthoor,et al. Trade-offs in loop transformations , 2009 .
[21] William Pugh,et al. A practical algorithm for exact array dependence analysis , 1992, CACM.
[22] Wei Li,et al. Unifying data and control transformations for distributed shared-memory machines , 1995, PLDI '95.
[23] Anne Mignotte,et al. Loop alignment for memory accesses optimization , 1999, Proceedings 12th International Symposium on System Synthesis.
[24] Keshav Pingali,et al. A Singular Loop Transformation Framework Based on Non-Singular Matrices , 1992, LCPC.
[25] Per Gunnar Kjeldsberg. Storage Requirement Estimation and Optimization for Data Intensive Applications , 2001 .
[26] Chau-Wen Tseng,et al. Compiler optimizations for improving data locality , 1994, ASPLOS VI.
[27] Henk Corporaal,et al. Advanced copy propagation for arrays , 2003, LCTES '03.
[28] Liesbet Van der Perre,et al. Design of a Low Power Pre-synchronization ASIP for Multimode SDR Terminals , 2007, SAMOS.
[29] Paul Feautrier. Some efficient solutions to the affine scheduling problem , 1992 .
[30] Giovanni De Micheli,et al. SpC: synthesis of pointers in C: application of pointer analysis to the behavioral synthesis from C , 1998, ICCAD.
[31] Teresa H. Meng,et al. Portable video-on-demand in wireless communication , 1995, Proc. IEEE.
[32] Vivek Sarkar,et al. Optimization of array accesses by collective loop transformations , 1991, ICS '91.
[33] Rudy Lauwereins,et al. Data reuse exploration techniques for loop-dominated applications , 2002, Proceedings 2002 Design, Automation and Test in Europe Conference and Exhibition.
[34] Monica S. Lam,et al. A data locality optimizing algorithm , 1991, PLDI '91.
[35] Hai Zhou,et al. Parallel CAD: Algorithm Design and Programming Special Section Call for Papers TODAES: ACM Transactions on Design Automation of Electronic Systems , 2010 .
[36] Corinne Ancourt,et al. Automatic data mapping of signal processing applications , 1997, Proceedings IEEE International Conference on Application-Specific Systems, Architectures and Processors.
[37] Ken Kennedy,et al. Automatic translation of FORTRAN programs to vector form , 1987, TOPL.
[38] Peter Marwedel,et al. Scratchpad memory: a design alternative for cache on-chip memory in embedded systems , 2002, Proceedings of the Tenth International Symposium on Hardware/Software Codesign. CODES 2002 (IEEE Cat. No.02TH8627).
[39] Sharad Malik,et al. Flexible and formal modeling of microprocessors with application to retargetable simulation , 2003, 2003 Design, Automation and Test in Europe Conference and Exhibition.
[40] Keshav Pingali,et al. A singular loop transformation framework based on non-singular matrices , 1992, International Journal of Parallel Programming.
[41] W. Pugh,et al. A framework for unifying reordering transformations , 1993 .
[42] Guang R. Gao,et al. Collective Analysis and Transformation of Loop Clusters , 1992 .
[43] Qubo Hu,et al. Hierarchical Memory Size Estimation for Loop Transformation and Data Memory Platform Optimization , 2007 .
[44] Gerda Janssens,et al. Multi-dimensional incremental loop fusion for data locality , 2003, Proceedings IEEE International Conference on Application-Specific Systems, Architectures, and Processors. ASAP 2003.
[45] Erik Brockmeyer,et al. Layer assignment techniques for low energy in multi-layered memory organisations , 2003, 2003 Design, Automation and Test in Europe Conference and Exhibition.
[46] Vincent Loechner,et al. Parametric Analysis of Polyhedral Iteration Spaces , 1998, J. VLSI Signal Process..
[47] G. De Micheli,et al. SpC: synthesis of pointers in C application of pointer analysis to the behavioral synthesis from C , 1998, 1998 IEEE/ACM International Conference on Computer-Aided Design. Digest of Technical Papers (IEEE Cat. No.98CB36287).
[48] Patrice Quinton,et al. The Alpha du Centaur experiment , 1992 .
[49] Mahmut T. Kandemir,et al. A Layout-Conscious Iteration Space Transformation Technique , 2001, IEEE Trans. Computers.
[50] Doran Wilde,et al. A LIBRARY FOR DOING POLYHEDRAL OPERATIONS , 2000 .
[51] Hugo De Man,et al. Memory Size Reduction Through Storage Order Optimization for Embedded Parallel Multimedia Applications , 1997, Parallel Comput..
[52] Heiko Falk,et al. Control Flow Driven Splitting of Loop Nests at the Source Code Level , 2003, DATE.
[53] Albert Cohen,et al. Putting Polyhedral Loop Transformations to Work , 2003, LCPC.
[54] Cédric Bastoul,et al. Code generation in the polyhedral model is easier than you think , 2004, Proceedings. 13th International Conference on Parallel Architecture and Compilation Techniques, 2004. PACT 2004..