Application of the Parametric Bootstrap to Models that Incorporate a Singular Value Decomposition

SUMMARY Simulation is a standard technique for investigating the sampling distribution of parameter estimators. The bootstrap is a distribution-free method of assessing sampling variability based on resampling from the empirical distribution; the parametric bootstrap resamples from a fitted parametric model. However, if the parameters of the model are constrained, and the way these constraints are applied depends on the realized sample, then the resampling distribution obtained from the parametric bootstrap may become badly biased and overdispersed. Here we discuss such problems in the context of estimating parameters from a bilinear model that incorporates the singular value decomposition (SVD) and in which the parameters are identified by the standard orthogonality relationships of the SVD. Possible effects of the SVD parameter identification are arbitrary changes in the sign of singular vectors, inversion of the order of singular values and rotation of the plotted co-ordinates. This paper proposes inverse transformation or 'filtering' techniques to avoid these problems. The ideas are illustrated by assessing the variability of the location of points in a principal co-ordinates diagram and the marginal sampling distribution of singular values. An application to the analysis of a biological data set is described. The discussion points out that several exploratory multivariate methods may benefit from resampling with filtering.
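The sign and ordering problems named in the summary can be illustrated with a minimal sketch. This is not the paper's filtering procedure, which the summary does not spell out; it is one plausible way to map each bootstrap SVD back to the orientation of the fitted solution. The function names `align_svd` and `truncated_svd`, the Gaussian error model, the crude noise estimate and the greedy matching of components are all assumptions made for illustration.

```python
import numpy as np

def truncated_svd(X, k):
    """Return the leading k singular triplets of X (hypothetical helper)."""
    U, d, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k], d[:k], Vt[:k].T

def align_svd(U_ref, V_ref, U_b, d_b, V_b):
    """Reorder and re-sign bootstrap triplets to match a reference solution.

    Assumption: components are matched greedily on absolute inner products,
    which is a simplification and not a method taken from the paper.
    """
    k = U_ref.shape[1]
    # Sign-free similarity between reference and bootstrap components.
    sim = np.abs(U_ref.T @ U_b) + np.abs(V_ref.T @ V_b)
    order, free = [], list(range(k))
    for i in range(k):
        j = max(free, key=lambda c: sim[i, c])
        order.append(j)
        free.remove(j)
    U_b, d_b, V_b = U_b[:, order], d_b[order], V_b[:, order]
    # Flip left and right vectors together so the product U diag(d) V' is unchanged.
    s = np.sign(np.sum(U_ref * U_b, axis=0) + np.sum(V_ref * V_b, axis=0))
    s[s == 0] = 1.0
    return U_b * s, d_b, V_b * s

rng = np.random.default_rng(0)
n, p, k = 12, 6, 2

# Simulated "observed" data: a rank-2 bilinear signal plus Gaussian noise.
signal = rng.normal(size=(n, k)) @ np.diag([5.0, 4.5]) @ rng.normal(size=(k, p))
X = signal + 0.3 * rng.normal(size=(n, p))

# Fitted model: rank-k SVD approximation and a crude residual noise estimate.
U0, d0, V0 = truncated_svd(X, k)
fitted = (U0 * d0) @ V0.T
sigma_hat = np.sqrt(np.mean((X - fitted) ** 2))

raw, filtered = [], []
for _ in range(200):
    # Parametric bootstrap: resample from the fitted model.
    Xb = fitted + sigma_hat * rng.normal(size=(n, p))
    Ub, db, Vb = truncated_svd(Xb, k)
    raw.append(Ub * db)                      # row co-ordinates, unfiltered
    Ua, da, Va = align_svd(U0, V0, Ub, db, Vb)
    filtered.append(Ua * da)                 # row co-ordinates, filtered

raw, filtered = np.array(raw), np.array(filtered)
print("mean bootstrap SD, unfiltered:", raw.std(axis=0).mean())
print("mean bootstrap SD, filtered:  ", filtered.std(axis=0).mean())
```

The alignment only permutes components and flips signs jointly, so each bootstrap reconstruction is left untouched; only the arbitrary identification is undone. Rotation of the plotted co-ordinates, the third effect listed in the summary, is not handled by this sketch.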
