Sorting by Address Calculation

Sorting in a random access memory is essentially a process of associating the address of the location in which each item is to be placed with the identifying key of the item. The fewer times the items have to be moved from one location to another, the more efficient the sorting process. The association of memory address to key in the sorted results can be looked upon as a functional relation; for convenience this will be called the "address function." Figure 1 shows a typical plot of key against memory address. The function is discontinuous and is similar to a cumulative histogram. If this function were known in advance for a particular batch of data and if it could be easily evaluated by the computer for the key of each item, all items could be inserted initially in correct memory locations without need for subsequent rearrangement of items.

In most cases, however, only a statistical approximation of the address function is available prior to sorting; and even if it were known exactly, the evaluation of the exact function would usually require an inordinate amount of calculation. The probable address function can be approximated by a simple mathematical expression. This expression, which will be called the "sorting function," can be used by the computer to calculate an approximate address for the item. The computer can then use an interfiling procedure beginning at this approximate address. Thus, an item with a high key is inserted in the high portion of the allowable range of memory locations. Similarly, an item with a low key is inserted in the low portion of the range.

For purposes of discussion, it will be assumed that all keys are greater than zero. Zero can then be used to indicate an empty location. For other ranges of keys, any suitable indicator could be used.

To sort an item, an address is calculated from the key with the sorting function. The machine examines the contents of that location. If zero, the item is inserted.
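The first step of this procedure can be illustrated with a minimal sketch in Python. The linear form of the sorting function is an assumption made here for illustration (it would suit keys spread roughly evenly over their range); the text only requires some simple expression approximating the address function. The names `sorting_function`, `try_direct_insert`, `max_key`, and `EMPTY` are hypothetical and do not come from the original.

```python
EMPTY = 0  # zero marks an empty location, as the text assumes all keys exceed zero


def sorting_function(key, max_key, size):
    """Assumed linear sorting function: maps keys 1..max_key to addresses 0..size-1."""
    return (key - 1) * size // max_key


def try_direct_insert(memory, key, max_key):
    """Calculate an address for the key and insert there if the location is empty.

    Returns (address, inserted). When inserted is False the location was
    occupied and a search for the proper location is still required.
    """
    address = sorting_function(key, max_key, len(memory))
    if memory[address] == EMPTY:
        memory[address] = key
        return address, True
    return address, False


memory = [EMPTY] * 10
high = try_direct_insert(memory, 95, 100)   # a high key lands near the top of the range
low = try_direct_insert(memory, 12, 100)    # a low key lands near the bottom
clash = try_direct_insert(memory, 97, 100)  # same calculated address as 95: occupied
```

In this sketch an occupied location is only reported, not resolved; the comparison and search that follow an occupied probe are outside its scope.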
If not zero, the machine compares the contents of the location with the item and decides whether to search upward or downward for the proper location in which to insert the item. Once begun, an upward search continues until either an empty location or an item with a larger key is found. When an empty location is found, the item is inserted. When an item with a larger key is found, the item is inserted in place of the larger item, and each subsequent item is moved up one location in turn until the first empty space is found. If searching downward is indicated by the first