Fast Fourier Transform Accelerated Fast Multipole Algorithm
This paper describes an ${\cal O}(p^2 \log_2(p) N)$ implementation of the fast multipole algorithm (FMA) for $N$-body simulations. This implementation is faster than the original FMA, which is ${\cal O}(p^4 N)$, where $p$ is the number of terms retained in the truncated multipole expansion that represents the potential field of a collection of charged particles. The value of $p$ determines the accuracy of the calculation. The limiting ${\cal O}(p^4)$ computation in the original FMA is a convolution-like operation on a matrix of multipole coefficients. This paper gives the implementation details of converting this limiting computation to a linear convolution, which is then computed in the Fourier domain using the fast Fourier transform (FFT), based on a method originally outlined by Greengard and Rokhlin. In addition, this paper describes a new block decomposition of the multipole expansion data that provides numerical stability and efficient computation. The resulting ${\cal O}(p^2 \log_2(p))$ subroutine achieves a speedup of 2 over the original method on a sequential processor for $p=8$, and a speedup of 4 for $p=16$. The new subroutine also vectorizes well, with a speedup of 3 on a vector processor at $p=8$ and a speedup of 6 at $p=16$.
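The general idea of replacing an ${\cal O}(p^4)$ convolution with an FFT-based one can be illustrated with a minimal sketch. It is not the paper's multipole translation operator, and it omits the block decomposition. It is a generic 2-D linear convolution of two $p \times p$ complex matrices, assuming NumPy. The function names `direct_conv2d` and `fft_conv2d` are hypothetical and chosen only for this illustration. Zero-padding both inputs to $(2p-1) \times (2p-1)$ before transforming is what makes the circular convolution computed by the FFT equal the linear one.

```python
import numpy as np

def direct_conv2d(a, b):
    """Direct 2-D linear convolution; O(p^4) work for two p x p inputs."""
    ra, ca = a.shape
    rb, cb = b.shape
    out = np.zeros((ra + rb - 1, ca + cb - 1), dtype=np.result_type(a, b))
    for i in range(ra):
        for j in range(ca):
            # Each entry of `a` adds a scaled, shifted copy of `b`.
            out[i:i + rb, j:j + cb] += a[i, j] * b
    return out

def fft_conv2d(a, b):
    """Same linear convolution via the FFT; O(p^2 log p) work."""
    # Pad to the full output size so circular wrap-around cannot occur.
    shape = (a.shape[0] + b.shape[0] - 1, a.shape[1] + b.shape[1] - 1)
    fa = np.fft.fft2(a, s=shape)   # `s` zero-pads the input to `shape`
    fb = np.fft.fft2(b, s=shape)
    # Pointwise product in the Fourier domain equals convolution in space.
    return np.fft.ifft2(fa * fb)

if __name__ == "__main__":
    p = 8
    rng = np.random.default_rng(0)
    # Stand-ins for matrices of complex expansion coefficients (random data).
    a = rng.standard_normal((p, p)) + 1j * rng.standard_normal((p, p))
    b = rng.standard_normal((p, p)) + 1j * rng.standard_normal((p, p))
    err = np.max(np.abs(direct_conv2d(a, b) - fft_conv2d(a, b)))
    print(f"max abs difference: {err:.2e}")
```

The two routines agree to floating-point rounding on well-scaled random data. Multipole coefficients can span many orders of magnitude, in which case a plain FFT product of this kind may lose accuracy; that is the stability concern the abstract's block decomposition addresses.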
[1] Feng Zhao, et al. The Parallel Multipole Method on the Connection Machine, 1991, SIAM J. Sci. Comput.
[2] K. Schmidt, et al. Implementing the fast multipole method in three dimensions, 1991.
[3] Leslie Greengard, et al. A fast algorithm for particle simulations, 1987.