Nonsingular Data Transformations: Definition, Validity, and Applications

This paper describes a unifying framework for nonsingular data transformations. It shows that a wide class of existing transformations can be expressed in this framework, which allows compound transformations to be performed in one step. Validity conditions for such transformations are developed, as is the form of the transformed program and data. Constructive algorithms that generate data transformations for different applications are described and applied to example programs. The transformations are shown to have a potentially significant impact on program performance and can be used in situations where traditional loop transformations are inappropriate.
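
The idea of performing a compound transformation in one step can be illustrated with a minimal sketch. It assumes, purely for illustration, that a data transformation is a linear map on two-dimensional array indices given by a nonsingular integer matrix; the abstract does not spell out the paper's actual formalism, and the names `TRANSPOSE`, `SKEW`, `mat_mul`, `apply_transform`, and `det2` are hypothetical.

```python
# Illustrative sketch only: index transformations modeled as 2x2 integer
# matrices. The matrices and helper names are assumptions, not the paper's.

def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def apply_transform(a, index):
    """Map an array index vector through the matrix a."""
    return tuple(
        sum(a[i][k] * index[k] for k in range(len(index))) for i in range(len(a))
    )

def det2(a):
    """Determinant of a 2x2 matrix; nonzero means the map is nonsingular."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

TRANSPOSE = [[0, 1], [1, 0]]  # (i, j) -> (j, i)
SKEW = [[1, 0], [1, 1]]       # (i, j) -> (i, i + j)

# Transpose first, then skew, collapsed into a single matrix.
COMPOUND = mat_mul(SKEW, TRANSPOSE)

if __name__ == "__main__":
    idx = (2, 5)
    two_steps = apply_transform(SKEW, apply_transform(TRANSPOSE, idx))
    one_step = apply_transform(COMPOUND, idx)
    print(two_steps, one_step, det2(COMPOUND))
```

Because the product of nonsingular matrices is itself nonsingular, the compound map sends distinct indices to distinct locations, which is the property that makes a single-step layout change safe in this simplified model.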