The Effectiveness of Low-Precision Floating Arithmetic on Numerical Codes: A Case Study on Power Consumption

Low-precision floating-point arithmetic, which trades numerical accuracy for a narrower bit width, is attracting attention because it can improve the performance of numerical programs. Computing with low-precision data is expected to yield a smaller memory footprint, faster computation, and energy savings. However, few studies have examined how low-precision arithmetic affects the power and energy consumption of numerical codes. In this study, we investigate how much power efficiency improves when low-precision arithmetic is used aggressively in HPC applications. In our evaluations, we analyze the power characteristics of programs for Poisson's equation and for ground motion simulation, using double-precision and single-precision floating-point arithmetic. We confirm that low-precision arithmetic improves energy efficiency, but that the improvement is heavily influenced by parameters such as data division and the number of OpenMP threads.
