A multi-tiered optimization framework for heterogeneous computing

Modern computing nodes often contain more than just a CPU. With the advent of GPU accelerators and Xeon Phi co-processors, many architectures are available for data processing. However, it is difficult to determine which device is best for a given application. The difficulty of predicting real-world performance stems from a lack of quantifiable data and of methods for analyzing it. This paper presents a novel, multi-tiered framework that uses Pareto optimization to objectively construct the best processing node for a set of computational kernels. By decomposing the optimization process into three distinct tiers (kernel, device, and system), the framework lets the system designer understand how the various computational variables affect device choices. We show how a combination of metrics and benchmarking is used to form Pareto sets. Moving through the tiers, these Pareto sets are combined to identify the combinations that maximize performance.
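As a rough illustration of the Pareto sets mentioned above, the following minimal sketch filters a list of candidates down to its non-dominated members. It is not the paper's implementation: the two objectives (throughput to maximize, power to minimize), the candidate names, and all numbers are assumptions invented for the example, as are the helper names `dominates` and `pareto_set`.

```python
from typing import Dict, List, Tuple

# Hypothetical candidate: (throughput in GFLOPS, power in watts).
# All values below are invented for illustration, not measurements.
Candidate = Tuple[float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on both objectives and
    strictly better on at least one (maximize throughput, minimize power)."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_set(candidates: Dict[str, Candidate]) -> List[str]:
    """Return the names of candidates that no other candidate dominates."""
    return sorted(
        name
        for name, point in candidates.items()
        if not any(dominates(other, point) for other in candidates.values())
    )


# Hypothetical device-level candidates.
devices = {
    "cpu": (100.0, 95.0),
    "gpu": (900.0, 225.0),
    "coprocessor": (700.0, 240.0),
    "low_power": (40.0, 15.0),
}

front = pareto_set(devices)
print(front)
```

In this made-up data set, `coprocessor` is dropped because `gpu` offers more throughput at lower power, while the other three candidates each represent a different trade-off and remain in the set. A tiered approach in the spirit of the abstract could apply the same filter at each tier and pass only the surviving candidates upward, though how the paper actually combines its sets is not specified here.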
