Increasing data locality and introducing Level-3 BLAS in the Neville elimination

Abstract: In this paper we present two new algorithmic variants for computing the Neville elimination, with and without pivoting, which improve data locality and cast most of the computations in terms of high-performance Level-3 BLAS. An experimental evaluation on a state-of-the-art multi-core processor demonstrates that the new blocked algorithms exhibit a much higher degree of concurrency and better cache usage, yielding higher performance while, in most cases, offering numerical accuracy akin to that of the traditional columnwise variant.
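
For orientation, the following is a minimal sketch of the traditional columnwise Neville elimination without pivoting, not of the blocked variants this paper presents. It zeroes each column below the diagonal by subtracting from every row a multiple of the row immediately above it. The function name `neville_eliminate` is hypothetical, and the sketch assumes a square matrix for which no row exchanges are needed (for example, a nonsingular totally positive matrix).

```python
def neville_eliminate(a):
    """Return an upper-triangular copy of the square matrix `a` (list of lists).

    Illustrative, unblocked Neville elimination without pivoting.
    Assumption: whenever an entry must be eliminated, the entry directly
    above it in the same column is nonzero.
    """
    n = len(a)
    u = [[float(x) for x in row] for row in a]
    for k in range(n - 1):
        # Sweep bottom-up so that row i-1 still holds its values from the
        # start of step k when it is used to update row i.
        for i in range(n - 1, k, -1):
            if u[i][k] == 0.0:
                continue  # nothing to eliminate in this row
            if u[i - 1][k] == 0.0:
                # A row exchange would be required; pivoting is not sketched.
                raise ZeroDivisionError("row exchange needed at step %d" % k)
            m = u[i][k] / u[i - 1][k]  # multiplier from the adjacent row
            for j in range(k, n):
                u[i][j] -= m * u[i - 1][j]
            u[i][k] = 0.0  # set the eliminated entry to an exact zero
    return u


if __name__ == "__main__":
    # Small Vandermonde matrix on nodes 1, 2, 3 (totally positive).
    for row in neville_eliminate([[1, 1, 1], [1, 2, 4], [1, 3, 9]]):
        print(row)
```

Each row update in this form is a Level-1 (vector) operation that touches only two adjacent rows, which is the data-access pattern that blocked, Level-3 formulations aim to improve on.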
