Multi-dimensional incremental loop fusion for data locality
[1] Corinne Ancourt,et al. Minimal Data Dependence Abstractions for Loop Transformations , 1994, LCPC.
[2] Gerda Janssens,et al. Feasibility of incremental translation , 2002 .
[3] Mark Horowitz,et al. Energy dissipation in general purpose microprocessors , 1996, IEEE J. Solid State Circuits.
[4] Frédéric Vivien,et al. Combining Retiming and Scheduling Techniques for Loop Parallelization and Loop Tiling , 1997, Parallel Process. Lett.
[5] Frédéric Vivien,et al. Optimal Fine and Medium Grain Parallelism Detection in Polyhedral Reduced Dependence Graphs , 2004, International Journal of Parallel Programming.
[6] Alain Darte,et al. New results on array contraction [memory optimization] , 2002, Proceedings IEEE International Conference on Application-Specific Systems, Architectures, and Processors.
[7] Kristof Beyls,et al. Reuse Distance as a Metric for Cache Behavior , 2001 .
[8] Guang R. Gao,et al. Collective Analysis and Transformation of Loop Clusters , 1992 .
[9] Cheng Wang,et al. Data locality enhancement by memory reduction , 2001, ICS '01.
[10] Paul Feautrier,et al. Some efficient solutions to the affine scheduling problem. I. One-dimensional time , 1992, International Journal of Parallel Programming.
[11] Sanjay V. Rajopadhye,et al. Generation of Efficient Nested Loops from Polyhedra , 2000, International Journal of Parallel Programming.
[12] Alain Darte. On the Complexity of Loop Fusion , 2000, Parallel Comput.
[13] H.J. De Man,et al. Automating High Level Control Flow Transformations for DSP Memory Management , 1992, Workshop on VLSI Signal Processing.
[14] Tarek S. Abdelrahman,et al. Fusion of Loops for Parallelism and Locality , 1997, IEEE Trans. Parallel Distributed Syst.
[15] Anne Mignotte,et al. Loop alignment for memory accesses optimization , 1999, Proceedings 12th International Symposium on System Synthesis.
[16] Yves Robert,et al. Affine-by-Statement Scheduling of Uniform and Affine Loop Nests over Parametric , 1995, J. Parallel Distributed Comput.
[17] Alain Darte,et al. Complexity of Multi-dimensional Loop Alignment , 2002, STACS.
[18] Ken Kennedy,et al. Maximizing Loop Parallelism and Improving Data Locality via Loop Fusion and Distribution , 1993, LCPC.
[19] Monica S. Lam,et al. Blocking and array contraction across arbitrarily nested loops using affine partitioning , 2001, PPoPP '01.
[20] Mahmut T. Kandemir,et al. A Layout-Conscious Iteration Space Transformation Technique , 2001, IEEE Trans. Computers.
[21] Teresa H. Y. Meng,et al. Design of a low power video decompression chip set for portable applications , 1996, J. VLSI Signal Process.
[22] Hugo De Man,et al. Memory Size Reduction Through Storage Order Optimization for Embedded Parallel Multimedia Applications , 1997, Parallel Comput.
[23] Monica S. Lam,et al. Maximizing parallelism and minimizing synchronization with affine transforms , 1997, POPL '97.
[24] Paul Feautrier,et al. Some efficient solutions to the affine scheduling problem. Part II. Multidimensional time , 1992, International Journal of Parallel Programming.
[25] Monica S. Lam,et al. A data locality optimizing algorithm , 1991, PLDI '91.
[26] Pierre Boulet,et al. Loop Parallelization Algorithms: From Parallelism Extraction to Code Generation , 1998, Parallel Comput.