A Case Study on Optimizing Accurate Half Precision Average

In this work, we study the numerical performance of several common algorithms for computing the average of an array of half-precision (FP16) floating-point values. While the current generation of CPUs does not support native FP16 arithmetic, it is a planned feature in a number of next-generation CPUs; we therefore emulate FP16 arithmetic with the half software library. Due to the limited range and precision of the FP16 data type, some algorithms proved insufficiently accurate even for arrays as small as 100 elements. We propose an algorithm that computes the average in FP16 in a numerically stable way and compare it to the naive single-precision (FP32) algorithm in terms of both numerical precision and runtime performance. We find that our algorithm offers robustness, numerical precision, and SIMD performance comparable to those of the higher-precision computation.
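
To illustrate why naive FP16 accumulation breaks down, the following sketch (not the algorithm proposed in this work) compares a naive FP16 running sum against an FP32 reference, assuming the half library's half_float::half type; the array size of 100 and the element value 50.5 are arbitrary choices for demonstration. Once the running sum exceeds 2048, the spacing between representable FP16 values is at least 2, so each further addition is rounded and the rounding error accumulates.

    // Illustrative sketch only: naive FP16 accumulation vs. an FP32 reference.
    // Assumes the "half" emulation library (half.hpp, half_float::half).
    #include <cstdio>
    #include <vector>
    #include "half.hpp"

    using half_float::half;

    int main() {
        const int n = 100;
        std::vector<half> data(n, half(50.5f));  // true average is exactly 50.5

        // Naive FP16 accumulation: above 2048 the FP16 ulp is >= 2, so each
        // addition of 50.5 is rounded and the error grows with the array size.
        half sum16(0.0f);
        for (const half& x : data) sum16 += x;
        float avg16 = static_cast<float>(sum16) / n;

        // Reference: accumulate in FP32, then divide.
        float sum32 = 0.0f;
        for (const half& x : data) sum32 += static_cast<float>(x);
        float avg32 = sum32 / n;

        std::printf("naive FP16 average: %f\n", avg16);
        std::printf("FP32 average:       %f\n", avg32);
        return 0;
    }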