Parallel variable-band Choleski solvers for computational structural analysis applications on vector multiprocessor supercomputers

Abstract: A Choleski method for solving the linear systems of equations that arise in large-scale structural analyses is described. The method uses a novel variable-band storage scheme and is structured to exploit fast local memory caches while minimizing data access delays between main memory and vector registers. Several parallel implementations of this method are described for the CRAY-2 and CRAY Y-MP computers, demonstrating the use of microtasking and autotasking directives. A portable parallel language, FORCE, is also used for two different parallel implementations, demonstrating the use of CRAY macrotasking. Matrix factorization times are compared for three representative structural analysis problems, from runs made in both dedicated and multi-user modes on both the CRAY-2 and CRAY Y-MP computers. CPU and wall clock timings are given for the various parallel methods and are compared with single-processor timings of the same algorithm. Computation rates of over 1 GIGAFLOP (1 billion floating point operations per second) on a four-processor CRAY-2 and over 2 GIGAFLOPS on an eight-processor CRAY Y-MP are demonstrated, as measured by wall clock time in a dedicated environment. Reduced wall clock times for the parallel methods, relative to the single-processor implementation of the same Choleski algorithm, are also demonstrated for runs made in multi-user mode.
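As an illustration of the general idea behind variable-band storage, the following is a minimal serial sketch in Python. It is not the storage scheme or the CRAY implementation described in this paper, whose details are not given in the abstract. It assumes a generic row-wise layout in which each row of the lower triangle is stored from its first nonzero entry up to the diagonal; the function names `to_variable_band` and `cholesky_variable_band` are hypothetical.

```python
import math


def to_variable_band(a):
    """Convert a dense symmetric matrix to a row-wise variable-band layout.

    Assumed layout (illustrative only): row i of the lower triangle is kept
    from its first nonzero column first[i] up to and including the diagonal.
    """
    rows, first = [], []
    for i, row in enumerate(a):
        # First column holding a nonzero; fall back to the diagonal itself.
        j0 = next(j for j in range(i + 1) if row[j] != 0.0 or j == i)
        first.append(j0)
        rows.append([float(v) for v in row[j0:i + 1]])
    return rows, first


def cholesky_variable_band(rows, first):
    """Factor A = L L^T in place; L overwrites the stored band of A.

    Fill-in during Choleski factorization stays inside each row's band,
    so no storage beyond the original profile is needed.
    """
    n = len(rows)
    for i in range(n):
        fi = first[i]
        for j in range(fi, i + 1):
            fj = first[j]
            # Only columns stored in both row i and row j contribute.
            k0 = max(fi, fj)
            s = rows[i][j - fi]
            for k in range(k0, j):
                s -= rows[i][k - fi] * rows[j][k - fj]
            if j < i:
                rows[i][j - fi] = s / rows[j][j - fj]
            else:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                rows[i][j - fi] = math.sqrt(s)
    return rows
```

The inner loop over `k` is a dot product of two contiguous row segments, which is the kind of operation that vectorizes well. How the paper's method organizes these operations for vector registers and multiple processors is not covered by this sketch.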
