Effect of Learning on Listening to Ultra-Fast Synthesized Speech

A text-to-speech synthesizer that produces easily understandable speech at very fast speaking rates is expected to help people with visual disabilities acquire information effectively through screen-reading software. We investigated the intelligibility of Japanese text-to-speech systems at fast speaking rates, using four-digit random numbers as the vocabulary for a recall test. We also studied a fast, intelligible text-to-speech engine built with an HMM-based synthesizer trained on a corpus of fast-rate speech. The results showed that statistical models trained on the fast-speech corpus were effective. The learning effect was significant in the early stage of the trials and was sustained for several weeks.
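
As an illustration of the recall test described above, the following minimal sketch generates four-digit random numbers as stimuli and scores a listener's typed responses. It is not the authors' procedure: the function names (make_stimuli, score_recall), the fixed seed, and the choice to report both whole-item and per-digit accuracy are assumptions made for this example only.

```python
import random


def make_stimuli(n_items, seed=0):
    """Return n_items four-digit strings, zero-padded (e.g. '0042').

    Hypothetical helper: the seed only makes the list reproducible.
    """
    rng = random.Random(seed)
    return [f"{rng.randrange(10000):04d}" for _ in range(n_items)]


def score_recall(stimuli, responses):
    """Return (item_accuracy, digit_accuracy) for one listening session.

    item_accuracy: fraction of numbers recalled exactly.
    digit_accuracy: fraction of digits correct, compared position by position.
    Both measures are assumptions for this sketch.
    """
    if not stimuli or len(stimuli) != len(responses):
        raise ValueError("stimuli and responses must be non-empty and equal in length")

    item_hits = 0
    digit_hits = 0
    for target, answer in zip(stimuli, responses):
        # Pad or truncate the answer to four characters so positions line up.
        answer = answer.ljust(4)[:4]
        item_hits += target == answer
        digit_hits += sum(t == a for t, a in zip(target, answer))

    n = len(stimuli)
    return item_hits / n, digit_hits / (4 * n)


if __name__ == "__main__":
    stimuli = make_stimuli(5)
    # Simulate a listener who recalls every item correctly.
    print(score_recall(stimuli, list(stimuli)))
```

Computing these scores separately for each trial and plotting them in trial order would be one way to observe a learning effect across sessions.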