Hardware-independent application characterization

The trend in high-performance computing is to include computational accelerators such as GPUs or Xeon Phis in each node of a large-scale system. Qualitatively, such accelerators tend to favor codes that perform large numbers of floating-point and integer operations per branch, that exhibit high degrees of memory locality, and that are highly data-parallel. The question we address in this work is how to quantify those characteristics. To that end, we developed an application-characterization tool called Byfl that provides a set of “software performance counters”. These are analogous to the hardware performance counters provided by most modern processors but are implemented via code instrumentation: the equivalent of adding flops = flops + 1 after every floating-point operation, but in practice implemented by modifying the compiler's internal representation of the code.
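To make the idea of a software performance counter concrete, the following is a minimal hand-written sketch in Python. It is not Byfl's implementation: Byfl inserts its counting code inside the compiler, whereas here the updates are typed by hand into the source. The names Counters and dot_instrumented, and the choice of which events to count, are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    """Hypothetical software performance counters (illustrative names only)."""
    flops: int = 0      # floating-point operations executed
    loads: int = 0      # elements read from the input arrays
    branches: int = 0   # loop-condition tests executed


def dot_instrumented(x, y, c):
    """Dot product with a counter update written after each counted event."""
    total = 0.0
    for i in range(len(x)):
        c.branches += 1      # loop test that chose to continue
        a = x[i]
        c.loads += 1         # one element read from x
        b = y[i]
        c.loads += 1         # one element read from y
        prod = a * b
        c.flops += 1         # the "flops = flops + 1" after the multiply
        total = total + prod
        c.flops += 1         # and again after the add
    c.branches += 1          # final loop test that chose to exit
    return total


counters = Counters()
result = dot_instrumented([1.0, 2.0, 3.0], [4.0, 5.0, 6.0], counters)

# A hardware-independent metric of the kind discussed above.
flops_per_branch = counters.flops / counters.branches
print(result, counters, flops_per_branch)
```

Because the counts depend only on the operations the program performs, not on the machine that runs it, a ratio such as floating-point operations per branch can be compared across codes without reference to any particular processor.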