Information storage and retrieval
The letter and/or sound combinations that make up a human language are limited by the human ability to pronounce these sounds. Therefore, the standard library search, which as a rule looks for all possible combinations of letters to find a word, is wasteful. Certain letters simply cannot be followed by certain other letters, and a search for them is senseless. Following this same line of reasoning, letters very frequently occur in the combinations that are germane to the particular language. The growing amount of alphanumeric information presently being stored on magnetic tape presents increasingly difficult problems in both the number of tape reels used and the time necessary to search this mass of information in order to extract pertinent literature. At present, most of this literature on tape uses the standard IBM 6-bit code to express alphanumeric symbols. It is entirely feasible to record standard English literature on tape, be it professional abstracts or novels, using only approximately two-thirds of the binary bits needed to represent the same piece of written material in the conventional code. This can be accomplished by setting up, in a 9-bit code, the 400-odd letter combinations occurring most frequently. A 9-bit representation allows the programmer to set up as many as 512 symbols, thus leaving sufficient leeway to assign symbols to the most frequently used words, mathematical symbols, and professional expressions that are expected to be encountered in the literature to be recorded. In addition, these relatively short 9-bit symbols can be assigned to all key words that it may be necessary to look for later, thereby accelerating any future library search.
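The coding idea can be illustrated with a minimal sketch. It is not the author's procedure: the table-building rule (every single character first, then the most frequent adjacent pairs from a sample), the greedy left-to-right matching, and all function names are assumptions made for illustration. Only the 9-bit symbol width, the 6-bit baseline, the 512-symbol limit, and the figure of roughly 400 letter combinations come from the text.

```python
from collections import Counter

SYMBOL_BITS = 9                  # width of each code, as stated in the text
BASELINE_BITS = 6                # conventional per-character code, as stated in the text
TABLE_SIZE = 2 ** SYMBOL_BITS    # 512 symbols available


def build_table(sample: str, max_pairs: int = 400) -> dict[str, int]:
    """Assign codes to every single character, then to the most frequent pairs.

    Assumption: single characters are always included so that any text drawn
    from the sample's alphabet can be encoded.
    """
    singles = sorted(set(sample))
    pair_counts = Counter(sample[i:i + 2] for i in range(len(sample) - 1))
    room = max(0, min(max_pairs, TABLE_SIZE - len(singles)))
    pairs = [pair for pair, _ in pair_counts.most_common(room)]
    return {entry: code for code, entry in enumerate(singles + pairs)}


def encode(text: str, table: dict[str, int]) -> list[int]:
    """Greedy encoding: take a two-letter symbol when one exists, else one letter."""
    codes = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if len(pair) == 2 and pair in table:
            codes.append(table[pair])
            i += 2
        else:
            codes.append(table[text[i]])  # raises KeyError for unseen characters
            i += 1
    return codes


def decode(codes: list[int], table: dict[str, int]) -> str:
    """Invert the table and concatenate the entries."""
    reverse = {code: entry for entry, code in table.items()}
    return "".join(reverse[code] for code in codes)


def compression_ratio(text: str, codes: list[int]) -> float:
    """Bits used by the 9-bit symbols divided by bits used at 6 bits per character."""
    return (len(codes) * SYMBOL_BITS) / (len(text) * BASELINE_BITS)


if __name__ == "__main__":
    text = "the theory of the thing then"
    table = build_table(text)
    codes = encode(text, table)
    print(decode(codes, table) == text, compression_ratio(text, codes))
```

When every symbol covers exactly two characters, the cost is 9 / 2 = 4.5 bits per character, or three-quarters of the 6-bit baseline. Reaching the two-thirds figure requires each 9-bit symbol to cover 2.25 characters on average, which is where the remaining codes assigned to frequent words and expressions would contribute.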