Paper information - Median split trees: a fast lookup technique for frequently occuring keys

Median split trees: a fast lookup technique for frequently occuring keys

Split trees are a new technique for searching sets of keys with highly skewed frequency distributions. A split tree is a binary search tree in which each node contains two key values: a node value, which is a maximally frequent key in that subtree, and a split value, which partitions the remaining keys (with respect to their lexical ordering) between the left and right subtrees. A median split tree (MST) uses the lexical median of a node's descendants as its split value to force the search tree to be perfectly balanced, achieving both a space-efficient representation of the tree and high search speed. Unlike frequency-ordered binary search trees, the cost of a successful search of an MST is bounded by log n and very stable around minimal values. Further, an MST can be built for a given key ordering and set of frequencies in time n log n, as opposed to n^2 for an optimum binary search tree. A discussion of the application of MSTs to dictionary lookup for English is presented, and the performance obtained is contrasted with that of other techniques.
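The node layout and search rule described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the names `SplitNode`, `build_mst`, and `lookup` are hypothetical, the input is assumed to be a list of distinct `(key, frequency)` pairs already sorted by key, and the choice of the lower median for an even number of remaining keys is an assumption made here for concreteness.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SplitNode:
    node_key: str                    # a maximally frequent key in this subtree
    split_key: Optional[str]         # lexical median of the remaining keys (None at a leaf)
    left: Optional["SplitNode"] = None
    right: Optional["SplitNode"] = None


def build_mst(items):
    """Build a median split tree from (key, frequency) pairs sorted by key.

    Assumes keys are distinct. Returns None for an empty input.
    """
    if not items:
        return None
    # Choose a maximally frequent key as this node's value.
    top = max(range(len(items)), key=lambda i: items[i][1])
    node_key = items[top][0]
    rest = items[:top] + items[top + 1:]   # remaining keys, still sorted
    if not rest:
        return SplitNode(node_key, None)
    # Split on the (lower) lexical median of the remaining keys, so the
    # two subtrees differ in size by at most one key.
    mid = (len(rest) - 1) // 2
    split_key = rest[mid][0]
    return SplitNode(
        node_key,
        split_key,
        build_mst(rest[:mid + 1]),   # keys <= split_key
        build_mst(rest[mid + 1:]),   # keys >  split_key
    )


def lookup(root, key):
    """Return the number of nodes visited to find key, or None if absent."""
    node, visited = root, 0
    while node is not None:
        visited += 1
        if key == node.node_key:
            return visited
        if node.split_key is None:   # leaf: nowhere left to descend
            return None
        node = node.left if key <= node.split_key else node.right
    return None


if __name__ == "__main__":
    # Illustrative frequencies only, not data from the paper.
    freqs = {"the": 100, "of": 60, "and": 50, "a": 40, "in": 30, "cat": 2, "zebra": 1}
    tree = build_mst(sorted(freqs.items()))
    print(lookup(tree, "the"))    # 1: the most frequent key sits at the root
    print(lookup(tree, "zebra"))  # 3
    print(lookup(tree, "dog"))    # None
```

In this sketch the most frequent key is found with a single comparison, while the median split keeps every search within about log n nodes. After the initial sort, each level of the recursion does work linear in the number of keys, which is consistent with the n log n build time stated in the abstract.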

B. A. Sheil
