Statistical modelling

First note that in each of the examples above, it would be hopeless to attempt to find a deterministic function that gives the response for every possible set of values of the covariates. Instead, it makes sense to think of the data-generating mechanism as being inherently random, with perhaps a deterministic function relating average values of the responses to values of the covariates.

We model the responses $y_i$ as realisations of random variables $Y_i$. Depending on how the data were collected, it may seem appropriate to also treat the $x_i$ as random. However, in such cases we usually condition on the observed values of the explanatory variables. To aid intuition, it may help to imagine a hypothetical sequence of repetitions of the ‘experiment’ that was conducted to produce the data with the $x_i$, $i = 1, \ldots, n$, held fixed, and think of the dataset at hand as being one of the many elements of such a sequence.

In the course Principles of Statistics, theory was developed for data that were i.i.d. In our setting here, this assumption is not appropriate: the distributions of $Y_i$ and $Y_j$ may well be different if $x_i \neq x_j$. In fact, what we are interested in is how the distributions of the $Y_i$ differ. However, we will still usually assume that the data are at least independent. It turns out that with this assumption of independence, much of the theory from Principles of Statistics can be applied with little modification.

In this course we will study some of the most popular and important statistical models for data of the form (0.0.1). We begin with the linear model, which you will have met in Statistics IB.
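
The thought experiment of repeating the 'experiment' with the covariates held fixed can be made concrete with a small simulation. The sketch below is illustrative only: the mean function 1 + 2x, the Gaussian noise, and the names `simulate_responses`, `noise_sd` and `n_reps` are assumptions introduced here, not part of the notes. It draws many independent datasets at the same covariate values and averages the responses at each index, showing responses that are independent but not identically distributed.

```python
import random
import statistics


def simulate_responses(x, rng, noise_sd=1.0):
    """One hypothetical repetition: independent responses, one per covariate."""
    # Assumed for illustration: mean response 1 + 2 * x_i, Gaussian noise.
    return [1.0 + 2.0 * xi + rng.gauss(0.0, noise_sd) for xi in x]


rng = random.Random(0)       # seeded so the run is reproducible
x = [0.0, 1.0, 2.0]          # covariates, held fixed across repetitions
n_reps = 5000

# Each element is one dataset from the hypothetical sequence of repetitions.
datasets = [simulate_responses(x, rng) for _ in range(n_reps)]

# Averaging over repetitions approximates the mean response at each fixed x_i.
mean_by_index = [
    statistics.fmean(d[i] for d in datasets) for i in range(len(x))
]
print(mean_by_index)
```

Under these assumptions the three averages should settle near 1, 3 and 5 respectively, so the distributions of the responses differ across covariate values even though each response is drawn independently.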