Paper Information - Optimizing Transformations of Stencil Operations for Parallel Object-Oriented Scientific Frameworks on Cache-Based Architectures

High-performance scientific computing relies increasingly on high-level, large-scale, object-oriented software frameworks to manage both algorithmic complexity and the complexities of parallelism: distributed data management, process management, inter-process communication, and load balancing. This encapsulation of data management, together with the prescribed semantics of a typical fundamental component of such object-oriented frameworks (a parallel or serial array class library), provides an opportunity for increasingly sophisticated compile-time optimization techniques. This paper describes two optimizing transformations suitable for certain classes of numerical algorithms, one for reducing the cost of inter-processor communication and one for improving cache utilization; demonstrates and analyzes the resulting performance gains; and indicates how these transformations are being automated.

Kei Davis | Daniel J. Quinlan | Federico Bassetti
