Selection of main effects

Model specification is the most difficult part of prediction modeling. Especially in smaller data sets, it is virtually impossible to obtain a reliable answer to the question of which predictors are important and which are not. In this chapter, we focus on the problems associated with model reduction techniques such as stepwise selection, including overfitting and reduced quality of predictions from the selected model. Specific issues include instability of the selection, biased estimation of coefficients (the effects of selected predictors tend to be overestimated), and exaggerated statistical significance (p-values that are too small). Alternative approaches are discussed, such as limiting the number of candidate predictors, for example on the basis of a meta-analysis of the available literature, and some modern selection methods, such as the LASSO and the elastic net.