Performance/energy trade-off in scientific computing: the case of ARM big.LITTLE and Intel Sandy Bridge

Power consumption is one of the main obstacles to achieving exascale performance. Current research aims to overcome power constraints by using low-power processors. Although new processors feature sensors that enable precise power measurements, each exposes a different interface for collecting the data, which makes it difficult to correlate performance with energy consumption. To address this, the authors developed a platform-independent tool that collects power and energy data from homogeneous and heterogeneous systems. Using this tool, they provide a detailed comparison between a low-power processor (ARM big.LITTLE) and a high-performance processor (Intel Sandy Bridge-EP), running all applications from the NAS Parallel Benchmarks and a real-world soil irrigation simulator. The results show that the average power demand of Intel Sandy Bridge-EP is 12.6× to 152.4× higher than that of ARM big.LITTLE, whereas its average energy consumption is 1.6× to 7.1× higher. Overall, ARM big.LITTLE offers the better performance/energy trade-off as long as it takes less than 9.2× the execution time of Intel Sandy Bridge-EP to solve the same problem.
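The relation between the power and energy figures above can be illustrated with a minimal sketch. It assumes constant average power over a run, so that energy is simply average power multiplied by execution time; the function names and the example numbers are hypothetical and are not taken from the authors' tool or measurements.

```python
def energy_joules(avg_power_watts: float, runtime_seconds: float) -> float:
    """Energy-to-solution under the assumption of constant average power."""
    return avg_power_watts * runtime_seconds


def energy_ratio(power_ratio: float, slowdown: float) -> float:
    """Energy of the high-power system divided by energy of the low-power one.

    power_ratio: average power of the high-power system / low-power system.
    slowdown:    runtime of the low-power system / high-power system.
    A result above 1.0 means the low-power system uses less energy overall.
    """
    if power_ratio <= 0 or slowdown <= 0:
        raise ValueError("ratios must be positive")
    return power_ratio / slowdown


if __name__ == "__main__":
    # Hypothetical values: 20x the power draw, but the low-power system is 5x slower.
    print(energy_ratio(power_ratio=20.0, slowdown=5.0))
```

Under this simplification, the low-power system wins on energy only while its slowdown stays below the power ratio, which is why a much larger gap in power demand can shrink to a smaller gap in energy consumption.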
