Techniques for the Interactive Development of Numerical Linear Algebra Libraries for Scientific Computation

Developing high-performance numerical algorithms and using them effectively in application codes is an iterative process: the algorithms and their implementations are refined throughout the lifetime of the algorithm. This process requires knowledge and expertise from numerical analysis, computer software, compilers, machine architecture, and applications. To improve this process, the FALCON environment was developed to combine the analysis techniques of restructuring compilers with the algebraic techniques of numerical analysis. This thesis describes interactive techniques developed to extend the FALCON environment. These techniques allow the developer to improve the analysis of the algorithm, to restructure the algorithm using transformation patterns, to utilize additional information about structures within the data, and to control the generation of the target code. Experimental results show that the codes generated with the interactive techniques perform better than those generated automatically. In addition, the environment was extended to support the generation of C++ code. The C++ code generated by FALCON is typically faster than the code generated by other MATLAB translators, but usually slower than the Fortran 90 code generated by FALCON.
