Understanding Statistics by Using Spreadsheets to Generate and Analyze Samples

The random number and probability distribution functions in Excel allow the user to easily generate samples that simulate data typical of any kind of biomedical study. The act of generating the samples should provide the user with an implicit understanding of fundamental statistical concepts, including variables, probability, independence, sampling variation, linear modeling, random error, fixed effects, random effects, and individual responses. Analysis of the samples, which is essentially an attempt to recover the formulae that generated the samples, should reinforce these concepts and develop others related to statistical inference, including bias, confidence limits, statistical significance, and chances of benefit and harm. The spreadsheets accompanying this article provide examples of generation and analysis of data for reliability and validity studies and for simple and covariate-adjusted comparisons of group means without and with repeated measurement. An example is also given for generation of a binary variable for data simulating events, such as the occurrence of injuries, but the analysis by generalized linear modeling is currently not available in these spreadsheets.