Voice morphing by 3-D waveform interpolation surface and lossless tube area function

Voice morphing is the process of gradually transforming the voice of one speaker into that of another. The ability to change a speaker's individual characteristics while producing high-quality speech is useful in many applications. In multimedia and video entertainment, for example, voice morphing is the counterpart of visual morphing: while we see a face gradually changing from one person's to another's, we simultaneously hear the voice changing as well. Another application is forensic voice identification, where a voice bank of different pitches, rates, and timbres can assist in recognizing a suspect's voice. In this study we present a new technique that produces N intermediate voices changing gradually between the voices of two speakers, or a single voice signal that itself changes gradually. The technique has two components. The first is the creation of a 3-D prototype waveform interpolation (PWI) surface from the residual error signal, which is estimated by LPC analysis, in order to produce a new intermediate excitation signal. The second is the representation of the vocal tract by a lossless tube area function and the interpolation of the two speakers' parameters.
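The second component can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes that each speaker's vocal tract is already described by LPC reflection coefficients, that section areas follow the common convention A[i+1] = A[i] (1 - k[i]) / (1 + k[i]) (sign conventions differ between texts), and that the interpolation is linear in the areas; the abstract does not state which interpolation rule is used. All function names are hypothetical.

```python
import numpy as np

def reflection_to_area(k, a0=1.0):
    """Section areas of a lossless tube from reflection coefficients.
    Assumed convention: A[i+1] = A[i] * (1 - k[i]) / (1 + k[i]), A[0] = a0."""
    k = np.asarray(k, dtype=float)
    ratios = (1.0 - k) / (1.0 + k)
    return a0 * np.concatenate(([1.0], np.cumprod(ratios)))

def area_to_reflection(areas):
    """Inverse of reflection_to_area under the same convention."""
    areas = np.asarray(areas, dtype=float)
    return (areas[:-1] - areas[1:]) / (areas[:-1] + areas[1:])

def reflection_to_lpc(k):
    """Step-up recursion: returns [1, a_1, ..., a_p] of A(z) = 1 + sum a_i z^-i."""
    a = np.array([1.0])
    for ki in k:
        a_ext = np.concatenate((a, [0.0]))
        a = a_ext + ki * a_ext[::-1]
    return a

def interpolate_tract(k_src, k_tgt, alpha):
    """Blend two area functions (assumption: linear in area) and
    return the reflection coefficients of the blended tube."""
    areas = ((1.0 - alpha) * reflection_to_area(k_src)
             + alpha * reflection_to_area(k_tgt))
    return area_to_reflection(areas)

def intermediate_tracts(k_src, k_tgt, n):
    """Reflection coefficients for n intermediate voices (endpoints excluded)."""
    alphas = np.linspace(0.0, 1.0, n + 2)[1:-1]
    return [interpolate_tract(k_src, k_tgt, a) for a in alphas]

# Illustrative values only, not measured from real speech.
k_a = [0.5, -0.3, 0.2]
k_b = [-0.4, 0.6, 0.1]
for k_mid in intermediate_tracts(k_a, k_b, 3):
    print(np.round(reflection_to_lpc(k_mid), 3))
```

Because the blended areas stay positive, every intermediate reflection coefficient has magnitude below one, so the corresponding all-pole synthesis filter remains stable. That property is one reason to interpolate in the area domain rather than directly on the LPC coefficients.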
