Valid Post-selection Inference Online Appendix

B.1 The Full Model Interpretation of Parameters. In the full model interpretation, coefficients always have the same fixed meaning as full model parameters. Variable selection then means setting some coefficient estimates to zero, and estimates exist for all predictors, irrespective of whether they are selected or deselected. The full model interpretation of parameters is appropriate, for example, if the full model is viewed as “data generating” and the predictors are hence causal for the response, or if the full model describes a physical system where the full set of predictors is needed to capture the system fully. In such situations it is natural to consider the coefficients in the full model as targets of estimation, even though “nature” may choose to set some of them to zero. This view is meaningful, for example, in tomography applications where the variables constitute voxels and their coefficients are rates of absorption, so that variable selection amounts to selection of voxels with high absorption. The selected voxels are used for display and medical diagnosis, and there is no meaning in interpreting them as constituting a submodel.
If full model parameters are estimated by forcing some of them to zero and estimating the remainder via least squares, then the result is a type of shrinkage estimator for the full model parameters. Such estimators are often referred to as “preliminary test estimators”; see Saleh (2006) for a comprehensive treatment and many references. These estimators are also closely related to the more recently studied “hard threshold” and “soft threshold” estimators; for a taste of the extensive literature on these and related estimators see Tsybakov (2009) and references therein.
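As an illustration of the “hard threshold” and “soft threshold” estimators mentioned above, the following is a minimal sketch and not a procedure taken from this appendix. It assumes an orthonormal design, in which case zeroing deselected coefficients and refitting the rest by least squares leaves the retained full model estimates unchanged, so the preliminary test estimator reduces to hard thresholding. The function names, the values in `beta_hat`, and the cutoff are hypothetical and chosen only for illustration.

```python
from typing import List


def hard_threshold(coefs: List[float], cutoff: float) -> List[float]:
    """Keep a coefficient unchanged if |b| > cutoff; otherwise set it to zero."""
    return [b if abs(b) > cutoff else 0.0 for b in coefs]


def soft_threshold(coefs: List[float], cutoff: float) -> List[float]:
    """Move each coefficient toward zero by cutoff; zero those with |b| <= cutoff."""
    out = []
    for b in coefs:
        if b > cutoff:
            out.append(b - cutoff)
        elif b < -cutoff:
            out.append(b + cutoff)
        else:
            out.append(0.0)
    return out


# Hypothetical full model least squares estimates (illustrative values only).
beta_hat = [2.5, -0.3, 0.0, -1.75, 0.5]
cutoff = 1.0  # hypothetical threshold, e.g. a critical value times a standard error

beta_hard = hard_threshold(beta_hat, cutoff)
beta_soft = soft_threshold(beta_hat, cutoff)

# Indices of the "selected" predictors: those with non-zero estimates.
selected = [j for j, b in enumerate(beta_hard) if b != 0.0]

print("hard:", beta_hard)
print("soft:", beta_soft)
print("selected:", selected)
```

In both cases the output is an estimate of the full coefficient vector, with one entry per predictor. The selected indices are only a by-product, which matches the view that deselected predictors still have estimates, namely zero.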
The “submodel” corresponding to non-zero parameter estimates is viewed as a computational compression and a parsimonious statistical summary of the data, but it is viewed neither as a model in its own right nor as an object of future scientific research. Inferential problems with the full model view of variable selection are pointed out by the “Vienna School” in the series of articles referenced in