Penalized least squares versus generalized least squares representations of linear mixed models

The methods in the lme4 package for R for fitting linear mixed models are based on sparse matrix methods, especially the Cholesky decomposition of sparse positive-semidefinite matrices, in a penalized least squares representation of the conditional model for the response given the random effects. The representation is similar to that in Henderson's mixed-model equations. An alternative representation of the calculations is as a generalized least squares problem. We describe the two representations, show their equivalence, and explain why we feel that the penalized least squares approach is more versatile and more computationally efficient.

1 Definition of the model

We consider linear mixed models in which the random effects are represented by a q-dimensional random vector, B, and the response is represented by an n-dimensional random vector, Y. We observe a value, y, of the response; the random effects are unobserved. For our purposes, we will assume a "spherical" multivariate normal conditional distribution of Y given B. That is, we assume that the variance-covariance matrix of Y | B is simply $\sigma^2 I_n$, where $I_n$ denotes the identity matrix of order n. (The term "spherical" refers to the fact that contours of the conditional density are concentric spheres.)
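As a concrete illustration of this setup, here is a minimal R sketch of our own construction, not taken from the paper: the dimensions n and q, the grouping factor f, the indicator matrix Z, and the fixed intercept are all assumed purely for illustration. The point it demonstrates is that, given B = b, the response Y is drawn with variance-covariance matrix $\sigma^2 I_n$.

    ## Simulate the model: B is q-dimensional, Y is n-dimensional, and the
    ## conditional distribution of Y given B = b is spherical normal,
    ## i.e. Var(Y | B = b) = sigma^2 * I_n.  (Illustrative values only.)
    set.seed(1)
    n <- 30                       # dimension of the response Y
    q <- 3                        # dimension of the random effects B
    sigma <- 0.5                  # conditional standard deviation of Y given B

    f <- gl(q, n / q)             # hypothetical grouping factor with q levels
    Z <- model.matrix(~ 0 + f)    # n-by-q indicator matrix relating B to Y
    b <- rnorm(q)                 # one (unobserved) realization of B

    mu <- 10 + Z %*% b            # conditional mean of Y given B = b (intercept arbitrary)
    y  <- as.vector(mu + rnorm(n, sd = sigma))  # observed value y of Y

A model with this structure could then be fit with lme4 as, e.g., lmer(y ~ 1 + (1 | f)), which estimates the fixed intercept, the random-effects variance, and $\sigma^2$ from the observed y alone, since b itself is never observed.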