Role of prosody in cognitive process of spoken language

The authors are developing a speech synthesis system for arbitrary text. With the aim of improving the quality of the synthesized speech, the effect of prosodic information, e.g., pitch structure and time structure, on speech intelligibility was evaluated in a listening experiment using synthesized speech. The results verified quantitatively that pitch information works effectively when listeners are not concentrating their attention, aiding their understanding of the speech. Accent information proved especially important, so a text-to-speech synthesis system should assign accents with an accuracy of 98 percent or higher. The presence or absence of the phrase component did not affect speech intelligibility. Some of the results also suggest a direction for improving the performance of speech recognition systems. If a speech recognition system achieves a phrase recognition rate of 70 percent at the acoustic level, the rate can be raised to 90 percent or more, without using pitch information, by applying lower- and higher-level linguistic knowledge processing. In order of importance for improving intelligibility, these factors are lexicon, context, and accent information.