Gentleman (1974) gave some routines (algorithm AS 75) in Algol for linear least squares calculations; Griffiths and Hill (1985) contains a translation into Fortran. They do not include routines for changing the order of variables, for handling singularities, for calculating correlations or partial correlations from the Cholesky factorization, for calculating regression coefficients for a subset of the variables or for calculating the estimated covariances of such regression coefficients. Clarke (1981) provided an algorithm for changing the order of variables, but it requires the elements of the Cholesky factorization to be stored in a different order from Gentleman's, and it overwrites the implicit ls on the diagonal of the upper triangular factor with the row multipliers. The Cholesky factor is represented here as D112R where D is a diagonal matrix of row multipliers and R is an upper triangular matrix with ls on its diagonal. In the Fortran code, the diagonal of D is stored in arrayD and R is stored by rows, excluding its diagonal elements, in the one-dimensional array RBAR. This is exactly the same as in algorithm AS 75. This set of routines can be used with orthogonal reductions produced by using Gentleman's routines, or on their own. The basic algorithm used for updating or changing the order of variables is that used by Gentleman, though the present author has used the so-called fast planar rotation method described by Hammarling (1974) for many years. The Hammarling algorithm is faster for large numbers of variables, but its diagonal multipliers need fairly frequent rescaling, and this produces longer code and partially destroys the speed advantage. For compatibility with algorithm AS 75, the Hammarling algorithm has not been used here. The algorithm used by Gentleman is not well suited when there are more variables than observations. For instance, in near infra-red spectroscopy it is common to have say 500 variables (wavelengths) but only say 30 cases. Algorithm AS 75 requires the storage of the whole upper triangle of 124 750 elements, rather than just the first 30 rows of non-zero values which requires about 15 000 storage locations. If the user is confident that there are no linear dependencies among any 30 of the variables, then
[1]
S. Hammarling.
A Note on Modifications to the Givens Plane Rotation
,
1974
.
[2]
W. Gentleman.
Error analysis of QR decompositions by Givens transformations
,
1975
.
[3]
G. Stewart,et al.
Reorthogonalization and stable algorithms for updating the Gram-Schmidt QR factorization
,
1976
.
[4]
R. Farebrother.
Linear least squares computations
,
1988
.
[5]
M. R. B. Clarke,et al.
A Givens Algorithm for Moving from One Linear Model to Another Without Going Back to the Data
,
1981
.
[6]
R. W. Farebrother.
Remark AS R17: Recursive Residuals-A Remark on Algorithm AS 75: Basic Procedures for Large, Sparse or Weighted Linear Least Squares Problems
,
1976
.
[7]
I. D. Hill,et al.
Applied Statistics Algorithms.
,
1985
.
[8]
W. Morven Gentleman,et al.
Basic Procedures for Large, Sparse or Weighted Linear Least Squares Problems
,
1974
.
[9]
G. Stewart.
Introduction to matrix computations
,
1973
.
[10]
David A. Belsley,et al.
Regression Analysis and its Application: A Data-Oriented Approach.@@@Applied Linear Regression.@@@Regression Diagnostics: Identifying Influential Data and Sources of Collinearity
,
1981
.