Matrix Multiplication, Trilinear Decompositions, APA Algorithms, and Summation

Matrix multiplication (hereafter MM) is among the most fundamental operations of modern computation. Its practical efficiency depends on several factors, in particular vectorization, data movement, and the arithmetic complexity of the computations, but here we focus on the arithmetic cost alone and on the impact of its study on other areas of modern computing. In the early 1970s it was expected that the straightforward cubic-time algorithm for MM would soon be accelerated to enable MM in nearly quadratic arithmetic time, with some far-fetched implications. In pursuit of this goal, mainstream research focused on decreasing the classical exponent 3 of the complexity of MM towards its lower bound 2, disregarding the growth of the input size required to support this decrease. Eventually, surprising combinations of novel ideas and sophisticated techniques decreased the exponent to its benchmark value of about 2.38, but the supporting MM algorithms improve on the straightforward one only for inputs of immense size. Meanwhile, communication complexity, rather than arithmetic complexity, has become the bottleneck of computations in linear algebra. This development may seem to undermine the value of past and future research aimed at decreasing the arithmetic cost of MM, but we feel that the study should be reassessed rather than closed and forgotten. We review old and new work in this area in the present-day context, recall some major techniques introduced in the study of MM, discuss their impact on the modern theory and practice of computations for MM and beyond, and link one of these techniques to some simple algorithms for inner product and summation.
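To make the notion of "decreasing the exponent" concrete, the following minimal sketch contrasts the straightforward cubic algorithm with the classical recursive scheme of Strassen, which multiplies 2 × 2 block matrices with 7 rather than 8 block products. Strassen's scheme is used here only as a familiar illustration; it is not one of the techniques singled out above. The function names (`classical`, `strassen`, and the helpers) are hypothetical, and the sketch assumes square matrices whose size is a power of 2.

```python
def classical(A, B):
    """Straightforward cubic-time product of two n x n matrices (n^3 multiplications)."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(X, Y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def sub(X, Y):
    return [[a - b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def quad(M, r, c, h):
    """Extract the h x h block of M whose top-left corner is (r, c)."""
    return [row[c:c + h] for row in M[r:r + h]]

def strassen(A, B):
    """Return (A*B, number of scalar multiplications); n must be a power of 2."""
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]], 1
    h = n // 2
    A11, A12, A21, A22 = quad(A, 0, 0, h), quad(A, 0, h, h), quad(A, h, 0, h), quad(A, h, h, h)
    B11, B12, B21, B22 = quad(B, 0, 0, h), quad(B, 0, h, h), quad(B, h, 0, h), quad(B, h, h, h)
    # Seven recursive block products instead of the eight of the block-wise classical rule.
    products = [
        strassen(add(A11, A22), add(B11, B22)),  # M1
        strassen(add(A21, A22), B11),            # M2
        strassen(A11, sub(B12, B22)),            # M3
        strassen(A22, sub(B21, B11)),            # M4
        strassen(add(A11, A12), B22),            # M5
        strassen(sub(A21, A11), add(B11, B12)),  # M6
        strassen(sub(A12, A22), add(B21, B22)),  # M7
    ]
    M1, M2, M3, M4, M5, M6, M7 = (p[0] for p in products)
    count = sum(p[1] for p in products)
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(add(sub(M1, M2), M3), M6)
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]
    bottom = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bottom, count
```

For n = 2^k the recursion performs 7^k scalar multiplications against n^3 = 8^k for the straightforward algorithm, for example 343 against 512 when n = 8, which corresponds to the exponent log2 7 ≈ 2.807. The sketch counts multiplications only; it ignores the extra additions and the data movement that, as noted above, often dominate in practice.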
