Automatic dynamic stack management in large scientific applications: A case study using a global spectral model

Compute and data intensive scientific applications demand compilers to allocate more temporaries on the stack. For example, the change resolution component of a global spectral model changes the resolution of the input files using Nearest Neighbour Interpolation which requires large temporaries on the stack. Temporaries include sub-arrays, automatic arrays and, sub-sections corresponding to actual arguments of a subroutine. If the infrastructure cannot provide adequate stack space at runtime relative to the total size of the temporaries, then the application program runs out of stack and aborts. Allocating the heap memory to store the large temporaries introduced around 25% additional runtime because of allocation and deallocation of the memory. This is observed in various components of a global spectral model. We propose an automatic dynamic stack management framework which uses application profile information and the information related to the required stack memory. It does not mandate any hardware configuration changes. This technique manages stack frames on RAM by the compiler-inserted code into the application binary. Our experiments with a global spectral model show that we are able to obtain an average runtime savings of 21% along with a compile time overhead of 4%. The actual gain depends on the size of the temporaries in an application and the size of the RAM. Currently, it supports sequential and OpenMP applications. We further enhance our framework to deal with the complex MPI and GPU programming paradigms.

[1]  John Shalf,et al.  The International Exascale Software Project roadmap , 2011, Int. J. High Perform. Comput. Appl..