Speech resynthesis from phoneme-related parameters.
In our recent work on speech analysis and resynthesis, we have used predictor-derived area functions [B. S. Atal and S. L. Hanauer, “Speech Analysis and Synthesis by Linear Prediction of the Speech Wave,” J. Acoust. Soc. Am. 50, 637–655 (1971)] to describe the spectrum of the acoustic signal. This description has been found to be insensitive to small changes in the parameters, which suggests that they overspecify the acoustic signal. If so, bit reduction might be possible. Usually, bit reduction is achieved by specifying the data less often or with less accuracy; we have instead chosen to reduce the bit rate in a way that is much more closely related to the speech-like nature of the signal being encoded. We have found that once the boundaries of the steady-state portion of each phoneme are located, the area parameters within these portions, as well as in the transitions between phonemes, can be represented by straight lines. This method allows the acoustic signal to be described with two sets of area parameters per phoneme. Numerous sentences have been encoded by this method, and the resulting sentences do not sound different from the sentences from which the data were derived.
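The straight-line representation can be illustrated with a minimal sketch. It assumes that each phoneme contributes two breakpoints (the start and end of its steady-state portion), each carrying one set of area parameters, and that every frame in between is recovered by linear interpolation. The function name `interpolate_areas`, the frame indices, and the two-section area values are hypothetical and chosen only for illustration; the abstract does not specify the authors' actual procedure or data.

```python
from bisect import bisect_right


def interpolate_areas(breakpoints, n_frames):
    """Rebuild per-frame area vectors from sparse breakpoints.

    breakpoints: list of (frame_index, area_vector) pairs sorted by
    frame_index, two per phoneme (hypothetical layout). Frames between
    consecutive breakpoints are filled in along straight lines, which
    covers both steady-state portions and transitions.
    """
    frames = [frame for frame, _ in breakpoints]
    tracks = []
    for t in range(n_frames):
        if t <= frames[0]:
            # Before the first breakpoint: hold the first area vector.
            tracks.append(list(breakpoints[0][1]))
        elif t >= frames[-1]:
            # After the last breakpoint: hold the last area vector.
            tracks.append(list(breakpoints[-1][1]))
        else:
            j = bisect_right(frames, t)  # frames[j-1] <= t < frames[j]
            t0, a0 = breakpoints[j - 1]
            t1, a1 = breakpoints[j]
            w = (t - t0) / (t1 - t0)
            tracks.append([x + w * (y - x) for x, y in zip(a0, a1)])
    return tracks


# Two hypothetical phonemes, each with a flat steady state (frames 0-4
# and 8-12) and a straight-line transition between them (frames 4-8).
breakpoints = [
    (0, [1.0, 2.0]),
    (4, [1.0, 2.0]),
    (8, [3.0, 1.0]),
    (12, [3.0, 1.0]),
]
tracks = interpolate_areas(breakpoints, 13)
print(tracks[6])  # midpoint of the transition
```

In this toy case, 13 frames of area parameters are stored as 4 breakpoints; the saving on real speech depends on phoneme durations and frame rate, which the abstract does not report.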