A stepwise approach to computing the multidimensional fast Fourier transform of large arrays

We consider the problem of performing a two-dimensional fast Fourier transform (FFT) on a very large matrix in limited core memory. We propose a decomposition of the Cooley-Tukey algorithm to allow efficient utilization of core memory and mass storage. The number of input/output operations is greatly reduced, with no increase in the computational burden. The method is suitable for nonsquare matrices and arrays of three or more dimensions.
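
As a rough illustration of the kind of decomposition described above, the following sketch splits the transform along the row dimension with a single Cooley-Tukey decimation-in-time step: the even-numbered and odd-numbered rows form two half-height sub-arrays that are transformed independently and then combined with twiddle factors. It is a minimal sketch and not necessarily the scheme of the paper. The function names (`fft1d`, `fft2d_in_core`, `fft2d_stepwise`) are hypothetical, the array dimensions are assumed to be powers of two, and ordinary Python lists stand in for both core memory and mass storage.

```python
import cmath


def fft1d(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft1d(x[0::2])
    odd = fft1d(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fft2d_in_core(a):
    """Row-column 2D FFT of a block assumed small enough to hold in memory."""
    rows = [fft1d(r) for r in a]
    cols = [fft1d(c) for c in zip(*rows)]  # transform each column
    return [list(r) for r in zip(*cols)]   # transpose back to row order


def fft2d_stepwise(a):
    """2D FFT of an R x C array via one decimation-in-time split over rows.

    Each half-height sub-array is transformed on its own (in a real
    out-of-core setting, each would be read from mass storage separately);
    the combine step then needs only one row from each half at a time.
    """
    r = len(a)
    half = r // 2
    even = fft2d_in_core(a[0::2])  # rows 0, 2, 4, ...
    odd = fft2d_in_core(a[1::2])   # rows 1, 3, 5, ...
    out = [None] * r
    for k in range(half):
        w = cmath.exp(-2j * cmath.pi * k / r)  # twiddle factor for row k
        out[k] = [e + w * o for e, o in zip(even[k], odd[k])]
        out[k + half] = [e - w * o for e, o in zip(even[k], odd[k])]
    return out
```

The split adds no arithmetic beyond the usual butterfly stage, since the combine loop is the same operation a full-height column FFT would perform as its last stage. In this sketch the same split could be applied again to each half until the sub-arrays fit in the available memory.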