Programming Technique: An improved hash code for scatter storage

Although scatter storage tables are used widely in system programming, they are subject to various drawbacks. One of these is that the size of the table cannot be arbitrary, but is restricted to powers of 2 by the hash coding method. In this note we present a new hash coding method that, besides being very simple and as fast as the best known methods, allows the table size to be almost any prime number.

The scatter storage techniques currently used in assemblers, compilers, and elsewhere are excellently summarized in [1]. Items are entered into a table using an index which is computed from the item by means of some hash coding method. As long as no two inserted items have the same hash code, searching and insertion are each performed in a single step, regardless of the size of the table. When two items have the same hash code, a collision is said to exist. In this case the second item must be put out of place in the table. This takes extra time, but if the hash codes are randomly distributed, the average number of steps is less than 2 even for a table which is 75 percent full.

The usual hash coding methods involve the calculation of a k-bit field which is assumed to be a random integer between 0 and 2^k - 1. Thus the table size is restricted to