Stencil Computations on AMD and Nvidia Graphics Processors: Performance and Tuning Strategies