Monte Carlo errors with less errors

We explain in detail how to estimate mean values and assess statistical errors for arbitrary functions of elementary observables in Monte Carlo simulations. The method is to estimate and sum the relevant autocorrelation functions, which is argued to produce more certain error estimates than binning techniques and hence to help toward a better exploitation of expensive simulations. An effective integrated autocorrelation time is computed which is suitable to benchmark efficiencies of simulation algorithms with regard to specific observables of interest. A Matlab code is offered for download that implements the method. It can also combine independent runs (replica) allowing to judge their consistency. HU-EP-03/32 SFB/CCP-03-12 ∗e-mail: uwolff@physik.hu-berlin.de 1 Formulation of the problem In this article we collect some theory and formulas for the practical estimation of statistical and systematic errors in Monte Carlo simulations. Emphasis is put on the estimation of in general nonlinear functions (‘derived quantities’) of the primary expectation values. The strategy focuses on the explicit determination of the relevant autocorrelation functions and times. We shall discuss why this is advantageous compared to the popular binning techniques, which handle autocorrelations only implicitly. A Matlab code is made available, which implements the method described here. The material given here is only partially new. It has been briefly discussed in the appendix of [1] and in internal notes of the ALPHA collaboration [2] and in lecture notes [3]. However, due to the fact that mastering this topic is an indispensable prerequisite for every student or researcher in the popular field of lattice simulation, we felt that there is a gap in the readily citeable literature which we want to fill here. We assume to have a number of primary observables enumerated by a Greek index with exact statistical mean values Aα. The object to be estimated is a function of these, F ≡ f(A1, A2, . . .) ≡ f(Aα). (1) A typical example is given by reweighting, where a quotient F = A1/A2 has to be determined. A more involved case are fit-parameters determined by correlation functions over a range of separations. For the estimation we employ Monte Carlo estimates of the primary observables given by aα. For each observable there are i = 1, 2, . . . , N successive estimates separated by executions of some valid update procedure. We assume that the Markov chain has been equilibrated before recording data beginning with i = 1. This has to be secured from case to case, and we in particular recommend visual inspections of initial histories of observables to decide on safely discarding data. A posteriori one should also check that the equilibration time was much longer than autocorrelation times found in the analysis of the supposed equilibrium data. A key theoretical quantity for error estimation is the autocorrelation [4, 5] function Γαβ defined by 〈 (aα − Aα)(a β − Aβ) 〉