Sorting with associative secondary storage devices

We describe a method for sorting large files stored on disks that possess an associative search capability. The method, called the bucket sort algorithm, uses a sort domain histogram to exploit this capability. We discuss how to establish the sort domain histogram and analyze the performance of the bucket sort algorithm. Compared to the standard merge sort algorithm, the bucket sort algorithm requires at most the processing time that merge sort needs for its initial run generation and the first pass of its merge operation, and it uses no disk storage space for temporary results. The histogram creation process is analogous to Edelberg and Schissler's gyro sort algorithm, which uses special hardware to rearrange data stored in electronic memory loops. The histogram creation process is more efficient than the gyro sort algorithm when each memory loop stores a large number of records and the distribution of sort domain values is not highly irregular.
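
The general idea of a histogram-driven bucket sort can be illustrated with a minimal Python sketch. It is not the algorithm as specified in the paper: the associative search is simulated by a plain range filter over an in-memory list, the histogram is an exact count per key rather than whatever summary the paper constructs, and the names `associative_search`, `build_bucket_bounds`, `bucket_sort`, and `capacity` are hypothetical. Under those assumptions, the histogram is used to cut the sort domain into key ranges that each fit in main memory, and each range is then fetched, sorted, and emitted in order, so no temporary results are written back to disk.

```python
from collections import Counter

def associative_search(disk, lo, hi):
    """Stand-in for the device's associative search (an assumption):
    return every (key, payload) record with lo <= key < hi.
    hi=None means the range has no upper bound."""
    return [rec for rec in disk if rec[0] >= lo and (hi is None or rec[0] < hi)]

def build_bucket_bounds(disk, capacity):
    """Build a histogram of sort-domain values and cut it into
    consecutive key ranges [lo, hi) holding at most `capacity` records.
    A single key with more than `capacity` duplicates still gets
    its own, oversized bucket in this simplified sketch."""
    histogram = Counter(key for key, _ in disk)
    bounds = []
    lo = None
    count = 0
    for key in sorted(histogram):
        if lo is None:
            lo = key                      # first key opens the first bucket
        elif count + histogram[key] > capacity:
            bounds.append((lo, key))      # close current bucket before `key`
            lo, count = key, 0
        count += histogram[key]
    if lo is not None:
        bounds.append((lo, None))         # last bucket is open-ended
    return bounds

def bucket_sort(disk, capacity):
    """Fetch each bucket with one associative search, sort it in
    memory, and append it to the output; no temporary disk storage."""
    out = []
    for lo, hi in build_bucket_bounds(disk, capacity):
        bucket = associative_search(disk, lo, hi)
        bucket.sort(key=lambda rec: rec[0])
        out.extend(bucket)
    return out
```

In this sketch every bucket costs one scan of the simulated disk, which mirrors the role of the associative search in selecting records by key range; how the real device performs that selection, and how the histogram is actually established, are outside what the sketch models.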