Energy Efficiency Evaluation of Multi-level Parallelism on Low Power Processors

Energy efficiency and energy consumption are becoming major concerns in the HPC area. One alternative considered for improving energy efficiency is the use of unconventional architectures in HPC, e.g., embedded and mobile processors. In this paper, we present an evaluation of multi-level parallelism on two low-power architectures: Intel Atom and ARM Cortex-A9. Our results show that, for all tested cases, Intel Atom outperforms ARM Cortex-A9 in terms of execution time and Energy-Delay Product.
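
The abstract reports results in terms of Energy-Delay Product (EDP) without defining it. As a minimal sketch, assuming the common definition (energy consumed multiplied by execution time, with energy taken as average power times execution time), the metric could be computed as follows. The function name and the input values are hypothetical and chosen only for illustration; they are not measurements from the paper.

```python
def energy_delay_product(avg_power_watts: float, exec_time_s: float) -> float:
    """Return EDP in joule-seconds, assuming EDP = energy * execution time."""
    # Energy (J) = average power (W) * execution time (s)
    energy_joules = avg_power_watts * exec_time_s
    # Multiplying by time again penalizes slow runs, so lower EDP is better
    return energy_joules * exec_time_s


# Hypothetical platforms, for illustration only (not data from the paper)
platform_a = energy_delay_product(avg_power_watts=10.0, exec_time_s=20.0)
platform_b = energy_delay_product(avg_power_watts=5.0, exec_time_s=50.0)
```

In this made-up case the platform that draws less power still ends up with the higher EDP, because execution time enters the metric quadratically.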
