Applications of boundary element methods on the Intel Paragon

This paper describes three applications of the boundary element method and their implementations on the Intel Paragon supercomputer. Each application sustains over 99 Gflop/s, measured from wall-clock time for the entire application and an actual count of the floating-point operations executed; one application sustains over 140 Gflop/s. Each application accepts a description of an arbitrary geometry and computes the solution to a problem of commercial and research interest. The common kernel of these applications is a dense equation solver based on LU factorization. It is generally accepted that dense matrix algorithms can achieve good performance, but achieving the excellent performance demonstrated here required developing a variety of special techniques to take full advantage of the power of the Intel Paragon.
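
As an illustration of the kernel and the performance metric named above, the following is a minimal serial sketch of a dense solve by LU factorization with partial pivoting that counts the floating-point operations it actually executes, together with a helper that converts a flop count and a wall-clock time into a sustained rate. It is not the parallel implementation described in the paper: the names lu_solve and sustained_gflops are hypothetical, the code runs on a single processor in pure Python, and it omits the special techniques the paper refers to.

```python
import time


def lu_solve(a, b):
    """Solve a x = b by LU factorization with partial pivoting.

    Returns (x, flops), where flops is the number of floating-point
    additions, subtractions, multiplications, and divisions executed.
    Hypothetical helper for illustration only.
    """
    n = len(a)
    a = [row[:] for row in a]  # work on copies; inputs stay untouched
    b = b[:]
    flops = 0

    for k in range(n):
        # Partial pivoting: bring the largest entry of column k to the diagonal.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0.0:
            raise ValueError("matrix is singular")
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]

        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]  # multiplier, an entry of L
            flops += 1
            a[i][k] = m
            for j in range(k + 1, n):
                a[i][j] -= m * a[k][j]  # update the trailing submatrix
                flops += 2
            b[i] -= m * b[k]  # forward substitution folded into elimination
            flops += 2

    # Back substitution with the upper-triangular factor U.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = b[i]
        for j in range(i + 1, n):
            s -= a[i][j] * x[j]
            flops += 2
        x[i] = s / a[i][i]
        flops += 1
    return x, flops


def sustained_gflops(flops, seconds):
    """Sustained rate in units of 1e9 floating-point operations per second."""
    return flops / seconds / 1e9


if __name__ == "__main__":
    start = time.perf_counter()
    x, flops = lu_solve([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0])
    elapsed = time.perf_counter() - start
    print(x, flops, sustained_gflops(flops, elapsed))
```

Counting operations as they execute, rather than using the nominal (2/3)n^3 estimate for LU factorization, mirrors the abstract's "actual count" wording. Dividing by wall-clock time for the whole run, not just the solver, gives the more conservative figure.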