An approach to the segmentation problem in speech analysis and language translation

The generation of proper word boundaries is an important part of several problems in information processing. Specifically, the speech recognition problem is often described as the production of a phonemic transcript, followed by the assembly of phonemes into complete words [1, 2, 3, 4]. The automatic translation of certain natural or artificial languages, for example Chinese or Japanese to English [5, 6, 7], or English to Braille [8], also requires the generation of words in the output language which may correspond either to several items of input or to only part of an input item.

The segmentation problem is often complicated by the fact that each item of input may be associated with several possible output correspondents, only one of which is acceptable in any given context. Frequently, the reduction of each set of multiple correspondents to a single choice depends at least partly on the proper recognition of word boundaries. The English phoneme sequence /aban/ might, for example, correspond to the indefinite article "a" followed by the noun "ban", or it might form the beginning of a verb or noun, as in "abandon" or "abandonment". Similarly, the Chinese character 自 (dzi), which may be translated as "self" when standing alone, may in combination with other characters be translated variously as "freedom", "self-defence", "ego", "originality", "naturally", "freely", "liberalism", and so on.

The generation of syntactically well-formed sentences in the output language is a common requirement for the set of problems under consideration. Since the material being processed does not, however, consist of complete syntactic units, it is first necessary to generate the appropriate structural information before any method based on syntax can be used. Two principal techniques are therefore proposed for the recognition of