Debugging and Optimization of HPC Programs with the Verrou Tool

The analysis of floating-point-related issues in HPC codes is becoming a topic of major interest: parallel computing and code optimization often break the reproducibility of numerical results across machines, compilers and even executions of the same program. This paper presents how the Verrou tool can help during all stages of the floating-point analysis of HPC codes: diagnosis, debugging and optimization. Recent developments of Verrou are presented, along with examples illustrating the value of these new features for industrial codes such as code_aster. More specifically, the Verrou arithmetic back-ends now allow analyzing or emulating mixed-precision programs. Interlibm, an interposition layer for the mathematical library, is introduced to mitigate long-standing issues with algorithms from the libm. Finally, the debugging algorithms are extended to produce useful information as soon as it becomes available. All these features are available in the released version 2.1.0 and the upcoming version 2.2.0.
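
As a minimal illustration of why parallel execution can break reproducibility, the sketch below sums the same values in two different orders. It does not use Verrou; the helper names `sequential_sum` and `chunked_sum` are hypothetical and only mimic, under simplifying assumptions, a serial loop and a two-worker reduction.

```python
def sequential_sum(values):
    """Accumulate left to right, as a plain serial loop would."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, n_chunks):
    """Reduce contiguous chunks first, then add the partial sums.

    This mimics (as an assumption) how a parallel reduction may
    reorder the additions across workers.
    """
    size = -(-len(values) // n_chunks)  # ceiling division
    partials = [
        sequential_sum(values[i:i + size])
        for i in range(0, len(values), size)
    ]
    return sequential_sum(partials)


# Floating-point addition is not associative.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)

# Same data, different reduction order.
values = [1e16, 1.0, -1e16, 1.0]
print(sequential_sum(values), chunked_sum(values, 2))
```

Both orderings are legitimate evaluations of the same mathematical sum, yet they round differently, which is the kind of discrepancy that floating-point analysis tools aim to diagnose.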
