A Scalable Server Architecture for Next-Generation Heterogeneous Compute Clusters

Increasing the energy efficiency of today's high-performance computing systems requires new approaches that go beyond homogeneous architectures, which primarily target maximum performance per node. Heterogeneous architectures that can be tailored to the specific needs of a particular application are a promising alternative to state-of-the-art server systems. In this paper, we present a novel, highly scalable server architecture that seamlessly integrates variable combinations of general-purpose CPUs, embedded CPUs, FPGAs, and GPUs. Embedded CPUs based on the latest ARM Cortex-A15 devices with integrated embedded GPUs are combined with FPGA-based reconfigurable SoCs, which can be used for application-specific hardware acceleration. A dedicated monitoring network enables continuous control and fine-grained observation of all relevant system parameters. Communication between the compute nodes is established by a flexible multi-level interconnect that can be adapted to various Ethernet and InfiniBand standards. The communication facilities are further enhanced by direct high-bandwidth, low-latency links between the embedded FPGA-based reconfigurable SoCs.
