Preliminary Investigation of Mobile System Features Potentially Relevant to HPC

The growing importance of energy consumption in scientific computing has driven interest in developing energy-efficient high-performance systems. Energy constraints in mobile computing have motivated the design and evolution of low-power computing systems capable of supporting a variety of compute-intensive user interfaces and applications. Others have observed that mobile devices have also evolved to provide high performance [14]. Their work has primarily examined the performance and efficiency of compute-intensive scientific programs executed either on mobile systems or on hybrids of mobile CPUs grafted into non-mobile (sometimes HPC) systems [6, 12, 14]. This report describes an investigation of the performance and energy consumption of a single scientific code on five high-performance and mobile systems, with the objective of identifying the performance and energy-efficiency implications of a variety of architectural features. The results of this pilot study suggest that the ISA is less significant than other specific aspects of system architecture in achieving high performance at high efficiency. The strategy employed in this study may be extended to other scientific applications with a variety of memory access, computation, and communication properties.

[1]  Ozalp Babaoglu, et al.  ACM Transactions on Computer Systems, 2007.

[2]  Venkatesh Pallipadi, et al.  The Ondemand Governor Past, Present, and Future, 2010.

[3]  Per Hammarlund, et al.  4th generation Intel™ Core processor, codenamed Haswell, 2013, 2013 IEEE Hot Chips 25 Symposium (HCS).

[4]  Karthikeyan Sankaralingam, et al.  ISA Wars, 2015, ACM Trans. Comput. Syst.

[5]  Wu-chun Feng, et al.  Towards efficient supercomputing: a quest for the right metric, 2005, 19th IEEE International Parallel and Distributed Processing Symposium.

[6]  Shirley Moore, et al.  Non-determinism and overcount on modern hardware performance counter implementations, 2013, 2013 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).

[7]  Carl Staelin, et al.  lmbench: Portable Tools for Performance Analysis, 1996, USENIX Annual Technical Conference.

[8]  Matthias Hauswirth, et al.  Accuracy of performance counter measurements, 2009, 2009 IEEE International Symposium on Performance Analysis of Systems and Software.

[9]  Vladimir V. Stegailov, et al.  Floating-point performance of ARM cores and their efficiency in classical molecular dynamics, 2016.

[10]  Nicholas Nethercote, et al.  Valgrind: a framework for heavyweight dynamic binary instrumentation, 2007, PLDI '07.

[11]  Karthikeyan Sankaralingam, et al.  Power struggles: Revisiting the RISC vs. CISC debate on contemporary ARM and x86 architectures, 2013, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA).

[12]  Mark S. Gordon, et al.  Performance and energy efficiency analysis of 64-bit ARM using GAMESS, 2015, Co-HPC@SC.

[13]  Ananta Tiwari, et al.  Compute bottlenecks on the new 64-bit ARM, 2015, E2SC '15.

[14]  Alejandro Rico, et al.  Tibidabo: Making the case for an ARM-based HPC system, 2014, Future Gener. Comput. Syst.

[15]  Boyana Norris, et al.  A Roofline Visualization Framework, 2015, ArXiv.

[16]  Mahmut T. Kandemir, et al.  Leakage Current: Moore's Law Meets Static Power, 2003, Computer.