Reverse Time Migration with Manycore Coprocessors

In this work we share our experiences and results from an implementation of RTM (Reverse Time Migration) that harnesses the compute power of CPUs and coprocessors cooperatively. The proposed implementation explores a unified programming model and a unified communication layer in a hybrid system composed of CPUs and MIC (Many Integrated Core) coprocessors. We demonstrate that, using the same C source code and the MPI+OpenMP programming model, we readily improve the throughput of our RTM algorithm and slightly increase its performance per watt. The proposed node configuration also frees memory on the 2-socket host for RTM formulations that need to save snapshots for cross-correlation, or other auxiliary arrays, between iterations of the algorithm. Extension to TTI RTM and to full waveform inversion is straightforward, since the same stencil optimizations, domain decomposition, and snapshot I/O schemes apply.
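To make the idea of a single C source shared by host and coprocessor concrete, the following is a minimal sketch of one finite-difference time step for the 2D constant-density acoustic wave equation, threaded with OpenMP. It is not the kernel used in this work: the function name `wave_step`, its parameters, the array layout, and the second-order stencil are all assumptions chosen for brevity, and the MPI domain decomposition is omitted.

```c
#include <stddef.h>

/* Hypothetical sketch: advance a 2D acoustic wavefield by one time step.
 * Second order in time and space; only interior points are updated.
 *   nx, nz    grid dimensions (z is the fastest-varying index)
 *   vel2dt2   per-point value of (velocity * dt)^2
 *   prev      wavefield at time step n-1
 *   curr      wavefield at time step n
 *   next      output wavefield at time step n+1
 *   inv_h2    1 / (grid spacing)^2
 */
void wave_step(int nx, int nz,
               const float *vel2dt2,
               const float *prev,
               const float *curr,
               float *next,
               float inv_h2)
{
    /* The pragma is ignored when compiled without OpenMP support,
       so the same source builds serially or threaded. */
    #pragma omp parallel for
    for (int ix = 1; ix < nx - 1; ix++) {
        for (int iz = 1; iz < nz - 1; iz++) {
            int i = ix * nz + iz;
            /* Five-point Laplacian of the current wavefield. */
            float lap = (curr[i - nz] + curr[i + nz] +
                         curr[i - 1] + curr[i + 1] -
                         4.0f * curr[i]) * inv_h2;
            /* Leapfrog update in time. */
            next[i] = 2.0f * curr[i] - prev[i] + vel2dt2[i] * lap;
        }
    }
}
```

In a hybrid run, each MPI rank (on the host or on a coprocessor) would call such a routine on its own subdomain and exchange halo cells with its neighbours between steps; a production kernel would typically use a higher-order stencil in 3D.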