A hierarchical Bayesian model for house effects in pre-electoral polls

It is widely known that pre-electoral polls often suffer from non-sampling errors, mainly due to imperfect sampling frames, nonresponse, and other forms of selection bias. These non-sampling errors may even outweigh sampling errors. To correct for the resulting biases, pollsters apply a variety of ad hoc adjustments, including the incorporation of expert opinion into the final estimates, which gives rise to the well-known house effects. We consider data on pre-election polls for the Italian general elections of 2006, 2008 and 2013, carried out by various pollsters and made available on a free-access governmental website. The response variables are the estimated vote shares of the principal Italian parties (those included in all available polls). Since each pollster carries out several pre-electoral polls in the observed time frame, the data have a hierarchical structure in which the groups are the pollsters. We propose a model to assess how much of the total variability is induced by the house effects. In particular, we consider a hierarchical Bayesian model in which each party's vote share is analyzed separately. The model has two stages: the first stage accounts for the within-pollster variability and includes a nonlinear time trend, whereas the second stage describes the between-pollster variability by means of a vector of random effects. A Gaussian distribution is employed both for the observations and for the random effects. These Gaussian distributions are parametrized so as to allow for the different variability due to the varying sample sizes and the underlying probability. Overall, the results confirm a large house effect in Italian pre-election polls. The effect is quite heterogeneous across parties and across pollsters. These conclusions appear to be quite stable across the years considered.
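
As a reading aid, the two-stage structure described above can be sketched as follows for a single party. This is a minimal sketch under stated assumptions, not the paper's exact specification: all symbols ($y_{ij}$, $n_{ij}$, $t_{ij}$, $\pi$, $f$, $\theta_j$, $\sigma^2$, $\tau^2$, $\rho$) are hypothetical notation introduced here, the binomial-type variance $\pi(1-\pi)/n_{ij}$ is one plausible way to let the variability depend on sample size and underlying probability, and the second-stage variance is simplified to a constant $\tau^2$.

```latex
% Hypothetical notation (not taken from the paper):
%   y_{ij}   estimated vote share of the party in poll i of pollster j
%   n_{ij}   sample size of that poll;  t_{ij}  its date
%   \pi      underlying vote share of the party
%   f        nonlinear time trend;  \theta_j  house effect of pollster j
\begin{align*}
  y_{ij} \mid \theta_j &\sim \mathcal{N}\!\left( f(t_{ij}) + \theta_j,\;
      \sigma^2 \, \frac{\pi(1-\pi)}{n_{ij}} \right)
      && \text{(first stage: within pollster)} \\
  \theta_j &\sim \mathcal{N}\!\left( 0,\; \tau^2 \right)
      && \text{(second stage: between pollsters)}
\end{align*}
% Assumed summary: share of total variance due to house effects
% for a poll of sample size n.
\[
  \rho(n) = \frac{\tau^2}{\tau^2 + \sigma^2 \, \pi(1-\pi)/n}
\]
```

Under this sketch, a value of $\rho(n)$ close to 1 would mean that the house effects dominate the sampling variability at sample size $n$, which is the kind of comparison the model is designed to support.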