Confidence intervals in discrete event simulation: A comparison of replication and batch means

Suppose that we have enough computer time to make n observations of a stochastic process by simulation and would like to construct a confidence interval for the steady-state mean. We can make k independent runs of m observations each (n = k·m) or, alternatively, one run of n observations, which we then divide into k batches of length m. These methods are known as replication and batch means, respectively. In this paper, using the coverage probability and the half-length of the confidence interval as criteria for comparison, we show empirically that batch means is superior to replication, but that neither method works well if n is too small. We also show that if m is chosen too small for replication, the coverage may decrease dramatically as the total sample size n is increased.
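The two constructions can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and not taken from the paper: the AR(1) test process `ar1_path`, the helper names, the choices k = 10 and m = 200, and the use of a Student's t interval on the k sample means with a hardcoded critical value. It shows only how the same budget of n = k·m observations is spent under each method; it does not attempt to reproduce the reported coverage results.

```python
import random
import statistics


def ar1_path(length, phi=0.9, x0=0.0, rng=random):
    """Hypothetical test process X_t = phi * X_{t-1} + eps_t (steady-state mean 0)."""
    x, path = x0, []
    for _ in range(length):
        x = phi * x + rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def t_interval(means, t_crit):
    """Return (center, half_length) of a t-interval built from k sample means."""
    k = len(means)
    center = statistics.fmean(means)
    half_length = t_crit * statistics.stdev(means) / k ** 0.5
    return center, half_length


def replication_ci(k, m, t_crit, rng):
    # k independent runs of m observations each; every run restarts from x0.
    means = [statistics.fmean(ar1_path(m, rng=rng)) for _ in range(k)]
    return t_interval(means, t_crit)


def batch_means_ci(k, m, t_crit, rng):
    # One run of n = k * m observations, cut into k consecutive batches of length m.
    path = ar1_path(k * m, rng=rng)
    means = [statistics.fmean(path[i * m:(i + 1) * m]) for i in range(k)]
    return t_interval(means, t_crit)


rng = random.Random(12345)  # fixed seed so the sketch is repeatable
K, M = 10, 200              # assumed values, n = K * M = 2000
T_CRIT = 2.262              # 0.975 quantile of Student's t with K - 1 = 9 d.f.

print("replication (center, half-length):", replication_ci(K, M, T_CRIT, rng))
print("batch means (center, half-length):", batch_means_ci(K, M, T_CRIT, rng))
```

Estimating the coverage probability would mean repeating either function many times and counting how often the interval contains the true steady-state mean, which is 0 for this assumed process.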