Letters to the editor: three letters on merging

Dear Editor:

It is important to clear up two commonly misunderstood points about tape merging. In the "Glossary of Sorting ..." the following is stated regarding the polyphase merge: "The effective power of the merge varies between T-1 and T/2, depending on the amount of input data and the number of strings." For the cascade merge, we read: "The effective power of the merge varies between T-1 and T/2, but in all cases is less than the power of the Polyphase Merge." Both of these statements are greatly in error, and they must have been made without any logical foundation.

The writer of this letter has carried out extensive theoretical calculations regarding the "effective power" of these two merge patterns, and computer simulation of the algorithms reveals perfect agreement with the theory. The details of the computations are too lengthy to be included here (they will eventually be published elsewhere), but since they have been confirmed by simulation there is no doubt as to their validity.

Suppose we have a cascade merge process [B. K. Betz and W. C. Carter, ACM Nat'l Meeting, 1959] which uses at most a k-way merge. If there are S initial strings produced by the first pass, the total number of passes over the data (including this first pass) is given very closely by the formula $u_k \ln S + v_k$, where $u_k$ and $v_k$ are constants; hence the effective power of the merge is $S^{1/(u_k \ln S + v_k - 1)}$, and as S becomes large this is given approximately by $e^{1/u_k}$. Values of $u_k$ and $v_k$ are tabulated below; the effective power of the merge for the cascade algorithm has the approximate form $2k/\pi + .321$ (where the constant $2/\pi$ is known from theoretical considerations, while the constant .321 is an empirical estimate). This formula is quite accurate for $k \ge 3$.

In the polyphase merge [R. L.
Gilstad, EJCC, 1960] with the same assumptions, the total number of times each string is read during the process is given approximately by $p_k S \ln S + q_k S$; hence the effective number of passes is $p_k \ln S + q_k$. As above, the effective power is $e^{1/p_k}$ as S becomes large; and in the case of the polyphase merge this quantity rapidly approaches 4 as k gets larger. (The effective number of passes is quite accurately given by the formula $\log_4 (S/(k-1)) + 2$, 10 …