Serial Correlation in the Generation of Pseudo-Random Numbers

Many practiced and proposed methods for the generation of pseudo-random numbers for use in Monte Carlo calculation can be expressed in the following way: One chooses an integer <italic>P</italic>, the base; an integer λ, the multiplier, prime to <italic>P</italic>; and an integer <italic>μ</italic>, the increment, less than <italic>P</italic> (<italic>μ</italic> is frequently, but not always, zero). One then defines recursively a sequence <italic>x</italic><subscrpt>0</subscrpt>, <italic>x</italic><subscrpt>1</subscrpt>, <italic>x</italic><subscrpt>2</subscrpt>, … (1) of integers by <italic>x</italic><subscrpt>0</subscrpt> = <italic>a</italic>, (2) <italic>x<subscrpt>n</italic>+1</subscrpt> ≡ λ<italic>x<subscrpt>n</subscrpt></italic> + <italic>μ</italic> (mod <italic>P</italic>), (3) 0 ≦ <italic>x<subscrpt>n</subscrpt></italic> < <italic>P</italic>. (4) It is clear that such a sequence is periodic with period <italic>P</italic> or less. Much work has been done on the determination of the period for various choices of <italic>P</italic>, λ and <italic>μ</italic> [1-7]. From this work one concludes that it is relatively easy to assure an adequately long period and that other considerations should determine the choice of these parameters. As for <italic>P</italic>, machine convenience dictates the choices <italic>P</italic> = 2<supscrpt><italic>q</italic></supscrpt> or <italic>P</italic> = 10<supscrpt><italic>q</italic></supscrpt> for a <italic>q</italic>-place binary or decimal machine, respectively. No other choice of <italic>P</italic> appears to have any practical advantage whatever. As for λ and <italic>μ</italic> it is the thesis of this note that they should be chosen to reduce serial correlation in the sequence (1), and we proceed to show how this may be done. We assume that λ and <italic>μ</italic> are such that the sequence (1) has an adequately long period. Then, clearly, one may assume, with error of order 1/<italic>P</italic> or less, that the sequence (1) is continuously and uniformly distributed on the interval (0, <italic>P</italic>). Then, if (&lpargt;<italic>Z</italic>⦔) denotes the mean value of <italic>Z</italic>, &lpargt;<italic>x<subscrpt>n</subscrpt></italic>⦔ = 1/<italic>P</italic> ∫<supscrpt><italic>P</italic></supscrpt><subscrpt>0</subscrpt> <italic>x dx</italic> = <italic>P</italic>/2 (5) &lpargt;<italic>x<subscrpt>n</subscrpt></italic><supscrpt>2</supscrpt>⦔ = 1/<italic>P</italic> ∫<supscrpt><italic>P</italic></supscrpt><subscrpt>0</subscrpt> <italic>x</italic><supscrpt>2</supscrpt> <italic>dx</italic> = <italic>P</italic><supscrpt>2</supscrpt>/3 (6) and Var (<italic>x<subscrpt>n</subscrpt></italic>) = <italic>P</italic><supscrpt>2</supscrpt>/3 - <italic>P</italic><supscrpt>2</supscrpt>/4 = <italic>P</italic><supscrpt>2</supscrpt>/12. (7) Further, one may write <italic>x<subscrpt>n</italic>+1</subscrpt> = <italic>P</italic>{(λ<italic>x<subscrpt>n</subscrpt></italic> + <italic>μ</italic>)<italic>P</italic><supscrpt>-1</supscrpt>}; (8) where {···} denotes “fractional part of.” (9) Also &lpargt;<italic>x<subscrpt>n</subscrpt>x<subscrpt>n</italic>+1</subscrpt>⦔ = 1/<italic>P</italic> ∫<supscrpt><italic>P</italic></supscrpt><subscrpt>0</subscrpt> <italic>xP</italic> {(λ<italic>x</italic> + <italic>μ</italic>)<italic>P</italic><supscrpt>-1</supscrpt>} <italic>dx</italic>. (10) An elementary but tedious integration yields (see Appendix) &lpargt;<italic>x<subscrpt>n</subscrpt>x<subscrpt>n</italic>+1</subscrpt>⦔ = <italic>P</italic><supscrpt>2</supscrpt>/4 + <italic>P</italic><supscrpt>2</supscrpt> - 6<italic>μ</italic>(<italic>P</italic> - <italic>μ</italic>)/12λ. (10′) Hence cov (<italic>x<subscrpt>n</subscrpt></italic>, <italic>x<subscrpt>n</italic>+1</subscrpt>) = <italic>P</italic><supscrpt>2</supscrpt> - 6<italic>μ</italic>(<italic>P</italic> - <italic>μ</italic>)/12λ, (11) and, if <italic>&rgr;</italic> (<italic>x</italic>, <italic>y</italic>) = the correlation coefficient of <italic>x</italic> and <italic>y</italic>, (12) then &rgr; (<italic>x<subscrpt>n</subscrpt></italic>, <italic>x<subscrpt>n</italic>+1</subscrpt>) = 1 - 6 <italic>μ</italic>/<italic>P</italic> (1 - <italic>μ</italic>/<italic>P</italic>)/λ. (13) One must be cautious of the conclusions one draws from (13) because of the approximate nature of its derivation. But one can say that a method which uses <italic>very</italic> small values of λ and <italic>μ</italic> is faulty. Unfortunately, such methods have been repeatedly suggested, and in one case within the author's knowledge serious difficulties were encountered in a Monte Carlo calculation because of the use of one such method. One should not lose sight of the fact that the correlation between a random number and its <italic>immediate</italic> successor is by no means the only one of importance. It is very difficult to give a precise rule, but it seems plausible that the correlation between the current random number and, at least, its next 10 or 20 successors should be controlled. Fortunately, for any such method one may write <italic>x<subscrpt>n</italic>+<italic>p</italic></subscrpt> ≡ λ<subscrpt><italic>p</italic></subscrpt><italic>x<subscrpt>n</subscrpt></italic> + <italic>μ</italic><italic>p</italic> (mod <italic>P</italic>), (14) with λ<subscrpt><italic>p</italic></subscrpt> ≡ λ<supscrpt><italic>p</italic></supscrpt> (mod <italic>P</italic>) (15) and <italic>μ</italic><subscrpt><italic>p</italic></subscrpt> ≡ λ<supscrpt><italic>p</italic></supscrpt> - 1/λ - 1 <italic>μ</italic> (mod <italic>P</italic>), (16) and again one may estimate the correlation with the use of equation (13).