Simulations of PDE-based systems, such as flight vehicles, the global climate, petroleum reservoirs, semiconductor devices, and nuclear weapons, typically perform an order of magnitude or more below other scientific simulations (e.g., from chemistry and physics) with dense linear algebra or N-body kernels at their core. In this presentation, we briefly review the algorithmic structure of typical PDE solvers that is responsible for this gap and consider possible architectural and algorithmic sources of performance improvement. Some of these improvements are also applicable to other types of simulations, but we examine their consequences for PDEs: the potential to exploit orders of magnitude more processor-memory units, better organization of the simulation for today's and likely near-future hierarchical memories, alternative formulations of the discrete systems to be solved, and new horizons in adaptivity. Each category is motivated by recent experiences in computational aerodynamics at the 1 Teraflop/s scale.